Linguistics 001 Lecture 9
Learning language; animal communication and language evolution

Why we should be impressed by first language acquisition

Do you remember learning to...

tie your shoes?

ride a bicycle?

read?

talk?

While you might recall the first three, it's unlikely you have real memories of learning to talk (as opposed to a few anecdotes of cute utterances, probably because your parents have repeated them to you).

And while you likely received instruction in the first three (if you've learned them at all), you did not require explicit instruction to begin talking.

One of the most impressive achievements of the human child is its ability to learn the complexities of a human language in just a few years, without any formal instruction.

By age 4, children have generally mastered the grammar of their native language, a staggeringly large and complex system, in spite of the fact that they are quite bad at most other tasks, both mental and physical. Adults past the age of about 15 are mostly incapable of duplicating this feat, in spite of the fact that they are generally quite good at other tasks, like learning algebra or ship-building.

Language acquisition differs from other learning processes in a number of ways. Most importantly, perhaps, it seems to require no explicit instruction. Parents seem to help their children a bit in many societies by speaking to them with what is called Motherese, but in many other societies this behavior is lacking, and children do just as well at acquiring their language.

Even in societies where parents make certain conscious efforts to help their children with this task, they don't actually do very much, and for the most part the children actually ignore them.

For example, studies have shown that parents rarely correct their children when they produce ungrammatical sentences. They are mainly concerned with the truth of what the children say, not how they say it.

And when they do try to correct them, the children don't seem to care. Consider the following dialogue between a psycholinguist and his child, reported by Pinker:

Child: Want other one spoon, Daddy.
Father: You mean, you want THE OTHER SPOON.
Child: Yes, I want other one spoon, please, Daddy.
Father: Can you say "the other spoon"?
Child: Other ... one ... spoon.
Father: Say ... "other".
Child: Other.
Father: "Spoon".
Child: Spoon.
Father: "Other ... Spoon".
Child: Other ... spoon. Now give me other one spoon?

Because of this, children must learn their language essentially without any negative evidence. That is, they have evidence of what sentences are in their language, because they can hear what adults say, but they have no evidence clearly telling them what sentences are not in their language.

Recall that the possible sentences of a language are infinite, so it can't be that children simply repeat only those sentences that they actually hear. Rather, they produce and understand novel sentences on the basis of the grammar that they construct.

Without any sort of head start, the problem of correctly deducing the structure and rules of a grammar on the basis of a finite sample like the one children are exposed to is probably impossible. Rather, it seems that they must have some idea of what they are looking for from birth. This is Chomsky's argument from the poverty of the stimulus in favor of some form of innate universal grammar.

Pinker discusses some of the things that children seem to be born knowing, for example that rules of grammar refer to categories, not specific words, and furthermore that they operate on phrases, not individual elements. They also probably have some idea of the types of categories to expect, i.e. they expect something like nouns and verbs to be found. These things may seem fairly simple, but they are extremely powerful and not entirely obvious. They make the complexity of the language learning problem much more manageable.

But of course, children are not born knowing everything about language. All of the details of the particular language of their community remain to be learned after birth, from basic word order, to inflectional type, to the basic phonemes in the language and the meanings of particular words. (In the lecture on language change we talked about some reasons for why this should be so, i.e. why not all of language is innate.)

One of the central questions being researched in linguistics is just how much the child is born knowing and how much she must learn. Another way to think about the issue is in terms of what all languages have in common and where they differ. That this idea is on the right track is supported by evidence from the types of mistakes that children actually make, or rather the mistakes that they don't make. Things that children do wrong in the language they are learning are typically possible in some languages of the world, but there are other types of imaginable mistakes that children never seem to make. For example, they never try to ask a question about one of two conjoined items:

*Who did Sheila see Mary and?

It is almost certainly not a coincidence that this sort of structure is ungrammatical in all of the world's languages. Children apparently never produce sentences such as these because they are somehow inconsistent with the basic make-up of the human language faculty.

In this lecture we will look more closely at how the language acquisition process works, go through the stages in which it proceeds, and discuss a bit of what it can tell us about human language in general.

The Critical Period

It is well known that children learn languages much more easily than adults. In fact, children don't have to work at learning a language at all; they just have to be exposed to it, and they do it. But adults, even with great effort and long exposure, usually retain at least some foreign accent and imperfect mastery of the grammar when they learn a new language.

The difference between children and adults is generally attributed to the critical period: if you don't learn a particular language as a child, you'll never learn it as easily or as well.

More specifically, ignoring a certain range of individual variation, experiments suggest that:

learning before the age of 7 yields perfect command;

learning between the ages of 8 and 15 yields progressively less perfect command;

learning at a later age shows no advantage for relative youth.

Several case studies exist to confirm these observations, which can be motivated purely from general patterns as well. In situations of extreme family dysfunction or misfortune, a child might be kept from social and linguistic interaction until a more advanced age. Language ability can be permanently impaired as a result.

Genie, who was isolated from society until the age of 13 1/2, never learned to produce more than telegraphic speech, that is, strings of words with an elementary syntax, but without the full grammatical apparatus of inflection and function words.

Mike paint.
Applesauce buy store.
Neal come happy; Neal not come sad.
Genie have Momma have baby grow up.
I like elephant eat peanut.

Isabelle, on the other hand, was isolated until the age of 6 1/2, and within a year and a half had mastered complex grammar, producing sentences like the following:

Why does the paste come out if one upsets the jar?
Do you go to Miss Mason's school at the university?

The difference in their ages is believed to be the crucial factor in their very different outcomes, and this assumption is confirmed by similar cases.

The critical period resembles other aspects of maturation in humans and animals. Failure to learn certain other skills before a certain age makes it difficult or impossible to learn them later:

in ducklings: ability to identify and follow the mother

in kittens: ability to perceive visual images

in sparrows: ability to learn the father's songs

Maintaining the neural circuits that allow the acquisition of language and these other skills is costly to the organism, and evolution has favored individuals who lose this costly allocation of resources once the learning has (normally) occurred.

The normal child learns language well before the age of 7 -- just as a duckling normally imprints on its mother right after hatching -- and there is no species-wide need to maintain the costly flexibility throughout the lifespan. Thus it fades, to the regret of modern individuals who would prefer to learn a second language as easily as they learned their first.

Of course, no single explanation need exclude other factors. The course of language acquisition corresponds well to the general rate of metabolic activity in the brain, which peaks at the age of 4 and declines through adolescence. It is difficult to say, however, to what extent this increased activity permits language learning, or is caused by it.

Stages of language learning

In nearly all cases, children's language development follows a predictable sequence. However, there is a great deal of variation in the age at which children reach a given milestone.

Furthermore, each child's development is usually characterized by gradual acquisition of particular abilities: thus "correct" use of English verbal inflection will emerge over a period of a year or more, starting from a stage where verbal inflections are always left out, and ending in a stage where they are nearly always used correctly.

There are also many different ways to characterize the developmental sequence. On the production side, one way to name the stages is as follows, focusing primarily on the unfolding of lexical and syntactic knowledge. The notation X;Y means X years and Y months of age.

Stage	Typical age	Description
Babbling	0;6 - 0;8	Repetitive CV patterns
One-word	0;9 - 1;6	Single open-class words or word stems
Two-word	1;6 - 2;0	"Mini-sentences" with simple semantic relations
Early multiword	2;0 - 2;6	"Telegraphic" sentence structures of lexical rather than functional or grammatical morphemes
Later multiword	2;6 on	Grammatical or functional structures emerge
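The X;Y age notation is easy to convert to a plain number of months, which is handy when comparing ages across studies that report months instead. Here is a minimal Python sketch; the function names are made up for illustration and are not part of any standard tool.

```python
def age_to_months(age: str) -> int:
    """Convert an age written as 'X;Y' (years;months) to total months."""
    years, months = age.split(";")
    return int(years) * 12 + int(months)


def months_to_age(total_months: int) -> str:
    """Convert a total number of months back to 'X;Y' notation."""
    years, months = divmod(total_months, 12)
    return f"{years};{months}"


print(age_to_months("1;6"))  # 18
print(age_to_months("2;6"))  # 30
print(months_to_age(27))     # 2;3
```

So the two-word stage in the table (1;6 - 2;0) corresponds to roughly 18 to 24 months.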

As mentioned above, by the age of four, the child has acquired the grammar of the language.

It is safe to say that except for constructions that are rare, predominantly used in written language, or mentally taxing even to an adult (like The horse that the elephant tickled kissed the pig), all parts of all languages are acquired before the child turns four.

(Dan Slobin, 1985, The Crosslinguistic Study of Language Acquisition; Steven Pinker, 1994, The Language Instinct)

There's evidence that some parts of language -- such as complex morphology, and even certain sounds at times -- are not mastered this early, but in general the fundamentals of the grammar are present by around age four.

We'll look at the stages in more detail below, starting with the basics of speaking.

Vocalizations in the first year of life

At birth, the infant vocal tract is in some ways more like that of an ape than that of an adult human.

Compare the diagram of the infant vocal tract shown above to diagrams of an ape and an adult human.

In particular, the tip of the soft palate (velum) reaches or overlaps with the tip of the epiglottis. This configuration helps prevent choking. As the infant grows, the tract gradually reshapes itself in the adult pattern, which increases the risk of choking but also permits a greater range of sounds to be articulated -- a prerequisite for acquiring the full adult grammar.

During the first 2 months of life, infant vocalizations are mainly expressions of discomfort (crying and fussing), along with sounds produced as a by-product of reflexive or vegetative actions such as coughing, sucking, swallowing and burping. There are some nonreflexive, nondistress sounds produced with a lowered velum and a closed or nearly closed mouth, giving the impression of a syllabic nasal or a nasalized vowel.

During the period from about 2-4 months, infants begin making "comfort sounds", typically in response to pleasurable interaction with a caregiver. The earliest comfort sounds may be grunts or sighs, with later versions being more vowel-like "coos". The vocal tract is held in a fixed position. Initially comfort sounds are brief and produced in isolation, but later appear in series separated by glottal stops. Laughter appears around 4 months.

During the period from 4-7 months, infants typically engage in "vocal play", manipulating pitch (to produce "squeals" and "growls"), loudness (producing "yells"), and also manipulating tract closures to produce friction noises, nasal murmurs, "raspberries" and "snorts".

At about 7 months, "canonical babbling" appears: infants start to make extended sounds that are chopped up rhythmically by oral articulations into syllable-like sequences (consonant plus vowel, or CV), opening and closing their jaws, lips, and tongue. The sounds produced are mostly heard as stop-like and glide-like. Fricatives, affricates, and liquids are more rarely heard, and clusters of consonants are even rarer. Vowels tend to be low and open, at least in the beginning.

Repeated sequences are often produced, such as [bababa] or [nanana], as well as "variegated" sequences in which the characteristics of the consonant-like articulations are varied. The variegated sequences are initially rare and become more common later on.

Both vocal play and babbling are produced more often in interactions with caregivers, but infants will also produce them when they are alone. Deaf children do a certain amount of babbling, though of course they can't hear themselves; it thus seems to be somewhat instinctual. Deaf children raised with signing parents "babble" with their hands -- trying out various movements as they learn the specific handshapes etc. of the ambient sign language.

No other animal does anything like babbling. It has often been hypothesized that vocal play and babbling have the function of "practicing" speech-like gestures, helping the infant to gain control of the motor systems involved, and to learn the acoustical consequences of different gestures.

Phonological development

Babbling is the first step in the development of a child's ability to pronounce words and the sounds that make them up. Children tend to follow certain patterns as they learn to produce better and better approximations of adult language -- though there is also a great deal of individual variation.

Simplification of syllable structure

The first syllables that children produce are typically consonant+vowel, or CV. This is the only syllable type that is found in all the languages of the world; it's the "perfect syllable" from the perspective of sonority, so it's not surprising that children master it first. In fact, the opposite relation is probably more to the point: all languages have it because it's the easiest one to learn and to produce.

[k]	"cat"
[bi]	"beat"
[so]	"soap"

It's often useful to think of a child as applying a phonological rule to the adult form, in order to produce the child's output. In this case, the rule is "Delete coda consonants."
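To make the idea of a rule applying to the adult form concrete, here is a rough Python sketch of "Delete coda consonants" for one-syllable words. The adult forms "bit" (beat) and "sop" (soap) are assumed transcriptions in the same informal ASCII style as the child forms above, and the vowel inventory is a simplification chosen for this illustration.

```python
# Assumed vowel symbols for this ASCII-style transcription (illustrative only).
VOWELS = set("aeiouIU")


def delete_codas(form: str) -> str:
    """Apply "Delete coda consonants" to a one-syllable form.

    Everything after the last vowel is treated as the coda and dropped.
    """
    vowel_positions = [i for i, ch in enumerate(form) if ch in VOWELS]
    if not vowel_positions:
        return form  # no vowel found: leave the form unchanged
    return form[: vowel_positions[-1] + 1]


# Hypothetical adult forms, transcribed in the same style as the child forms.
for adult in ["bit", "sop"]:
    print(adult, "->", delete_codas(adult))
```

The outputs "bi" and "so" match the child forms listed for "beat" and "soap".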

Simplification of consonant clusters

Another very common feature of child speech is reduction of clusters of more than one consonant to a single consonant.

[koz]	"clothes"
[bk]	"blanket"
[bp]	"bump"
[tk]	"truck"
[nkIs]	"necklace"
[kul]	"school"
[tayp]	"stripe"

Here the rule could be stated as "Reduce clusters to one consonant." Which consonant survives is a more complex matter, but often it will be a stop consonant (so that a liquid, nasal, or fricative is deleted). Again, this seems to be because stops are the least marked consonants, which means that they are in some sense the easiest to produce and the most common in the world's languages. A less dramatic, but related, reduction is found in words such as bankie for "blanket."
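The cluster rule, together with the preference for keeping a stop, can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: one character per segment, a small assumed inventory of vowels and stops, and assumed adult forms "kloz" (clothes) and "skul" (school). Real child data are messier, as noted above.

```python
# Assumed segment classes for this ASCII-style transcription (illustrative only).
VOWELS = set("aeiouIU")
STOPS = set("pbtdkg")


def reduce_clusters(form: str) -> str:
    """Reduce each run of consonants to one consonant.

    A stop is kept if the run contains one; otherwise the first consonant is kept.
    """
    output = []
    cluster = []

    def flush():
        # Emit the single surviving consonant for the current run, if any.
        if cluster:
            stops = [c for c in cluster if c in STOPS]
            output.append(stops[0] if stops else cluster[0])
            cluster.clear()

    for ch in form:
        if ch in VOWELS:
            flush()
            output.append(ch)
        else:
            cluster.append(ch)
    flush()
    return "".join(output)


print(reduce_clusters("kloz"))  # koz
print(reduce_clusters("skul"))  # kul
```

Both results agree with the child forms given for "clothes" and "school".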

In other cases, a new consonant that combines features of the adult consonants might be used.

[fIn]	"spin"

Here, the alveolar fricative + labial stop becomes a labial fricative. One child is reported to have made this change quite consistently in his speech, so the rule would be /sp/ --> [f].

Deletion of unstressed syllables

Early utterances are often just one syllable in length. Typically it's the stressed syllable that survives in the child's version. This syllable might be subject to other processes, such as cluster simplification.

[ba]	"bottle"
[be]	"baby"
[df]	"daddy"
[wt]	"water"
[chIk]	"chicken"
[wIn]	"window"

Truncation of a word to one syllable is actually quite common in many adult languages: in English, it's how we usually make nicknames (Sue from Susan, Pete from Peter) and other shortenings (vet for veteran, dis for disrespect). Deletion of various unstressed syllables is also one of the most common types of historical change. (Recall our discussion of processes like syncope and apocope in the lecture on language change.)

Sometimes -- especially at a somewhat later stage -- one unstressed syllable can also be preserved. The result is generally a stressed plus an unstressed syllable, with deletion of anything else in the adult form. This structure is called a trochee, or a trochaic foot: a way of organizing two syllables into a stressed+unstressed pair, found as a fundamental element in the prosody of a great many languages, including English.

[nna]	"banana"
[owa]	"granola"
[dedo]	"potato"
[beb]	"belly button"
[no]	"Eleanor"

Many languages use trochees to create shortened words as well: a smaller number of English nicknames are like this, as in Alex for Alexander and Becca for Rebecca.
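Both kinds of truncation (keeping only the stressed syllable, or keeping a stressed+unstressed trochee) can be stated as one small procedure. The Python sketch below assumes that words are already divided into syllables and marked for stress by hand; the spellings are ordinary orthography, not transcriptions, and the function name is hypothetical.

```python
def truncate(syllables, keep_trochee=False):
    """Shorten a word to its stressed syllable, or to a trochee.

    `syllables` is a list of (text, is_stressed) pairs. With keep_trochee=True,
    the unstressed syllable right after the stressed one is also kept.
    """
    for i, (text, stressed) in enumerate(syllables):
        if stressed:
            kept = [text]
            has_next = i + 1 < len(syllables)
            if keep_trochee and has_next and not syllables[i + 1][1]:
                kept.append(syllables[i + 1][0])
            return "".join(kept)
    # No stressed syllable marked: return the word unchanged.
    return "".join(text for text, _ in syllables)


# Hand-annotated syllables (stress marking is an assumption of this sketch).
banana = [("ba", False), ("na", True), ("na", False)]
potato = [("po", False), ("ta", True), ("to", False)]

print(truncate(banana))                     # na
print(truncate(banana, keep_trochee=True))  # nana
print(truncate(potato, keep_trochee=True))  # tato
```

In actual child speech the surviving syllables usually undergo further changes (substitution, reduplication and so on), so the output here is only the prosodic skeleton of the child's form.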

Prosodic patterns such as trochees -- but also the overall intonation of a phrase -- seem to be grasped by children quite early. Before they start speaking at all, they can distinguish the basic intonation patterns of their parents' language from those of other languages.

Reduplication

When more than one syllable is present in the output, children often show a preference for repeated syllables; in adult language, this is called reduplication and often serves a grammatical function. For example, the plural form of the verb alofa "love" in Samoan is alolofa.

For children, however, it probably reflects difficulty in coordinating two distinct articulations in the same word. Some children prefer words of this shape, while others prefer single-syllable utterances.

Reduplication can be full, in which case the two syllables are identical.

[wawa]	"water"
[baba]	"bottle"
[mama]	"mommy"
[kIkI]	"kitten"
[bb ]	"Patrick"

It can also be partial, in which case the two syllables are partly distinct, with either the consonants or the vowels identical.

[nk]	"necklace"
[y]	"Andrea"
[mimI]	"money"

It's no coincidence, of course, that "baby talk" contains many such words; even when adults use them, they're based on typical child pronunciations.

Consonant harmony, i.e. modifying consonants so that they share the same place of articulation, belongs in this category as well. Like reduplication, it stems from an avoidance of too-complex articulations (or cognitive representations for those articulations), but is especially interesting since it's practically unknown as a phenomenon in adult language.

[gg]	"dog"
[bib]	"Peter"
[kIk]	"chicken"

The end result of consonant harmony is similar to that of partial reduplication; the distinction is in large degree an artificial one, since what really matters is the output that the child is capable of producing.

Substitution

It takes a while for children to learn to pronounce all the individual consonants and vowels of the adult language. Until they do, there are many substitutions of sounds based on what the child is able to say at a particular point in time. These are quite varied, but some patterns are relatively common.

Stops for fricatives or affricates, since the articulation of a narrow opening for a fricative is more difficult than complete closure.

[tIp]	"ship"
[bt]	"bus"
[tIng]	"sing"
[pt]	"pencil"

Glides for the liquids /l, r/ or other consonants.

[wUk]	"look"
[y]	"Andrea"
[twk]	"truck"
[yay]	"light"

Front for back consonants, i.e. alveolar rather than palatal or velar. This makes sense because the tip of the tongue is easier to control, and also easier to have awareness of as the child develops knowledge of the articulators.

[dan]	"gone"
[tIp]	"chip"
[dzIdzI]	"Christmas"
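The three substitution patterns above can each be thought of as a small lookup table from adult segments to child segments. The Python sketch below illustrates this with a few entries taken from the examples; the tables are deliberately incomplete, the adult forms are assumed transcriptions, and each word is supplied as a list of segments so that two-letter symbols like "sh" and "ng" stay together.

```python
# Partial, illustrative substitution tables based on the examples above.
STOPPING = {"sh": "t", "s": "t"}             # stops for fricatives
GLIDING = {"l": "w", "r": "w"}               # glides for liquids (l may also become y)
FRONTING = {"g": "d", "k": "t", "ch": "t"}   # front for back consonants


def substitute(segments, table):
    """Replace each segment found in `table`; leave all other segments alone."""
    return "".join(table.get(seg, seg) for seg in segments)


# Assumed adult forms, split into segments by hand.
print(substitute(["sh", "I", "p"], STOPPING))  # tIp  "ship"
print(substitute(["s", "I", "ng"], STOPPING))  # tIng "sing"
print(substitute(["l", "U", "k"], GLIDING))    # wUk  "look"
print(substitute(["g", "a", "n"], FRONTING))   # dan  "gone"
```

Only one table is applied at a time here. A given child may of course combine several processes in the same word, and which ones apply varies from child to child.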

Though the examples given so far are from English, quite similar patterns are found in children learning any language, since they all start out with the same "instinct" for learning language (and the same articulatory limitations). Here are some examples from Finnish, comparing the child and adult forms (Vihman and Velleman 2000).

[ba]	pallo "ball"
[tu]	tuossa "there"
[ki]	ki:nni "closed"
[poi]	pois "off, away"
[pi:lo]	pi:lo:n "into hiding"
[poppu]	loppu "finished"
[kakk^ha]	kukka "flower"
[nanne]	nalle "teddybear"

You should be able to classify these modifications according to those illustrated for English.

The one-word (holophrastic) stage

At about 10 months, infants start to utter recognizable words (though they may differ considerably in pronunciation from the adult form, as we have just seen). Some word-like vocalizations that do not correlate well with words in the local language may consistently be used by particular infants to express particular emotional states.

One infant is reported to have used ah-a-yee to express pleasure, and another is said to have used muh-muh-muh to express distress or discomfort.

For the most part, recognizable words are used in a context that seems to involve naming:

duck while the child hits a toy duck off the edge of the bath;

sweep while the child sweeps with a broom

car while the child looks out of the living room window at cars moving on the street below;

papa when the child hears the father's voice.

Young children often use words in ways that are too narrow or too broad:

bottle used only for plastic bottles

teddy used only for a particular bear

dog used for lambs, cats, and cows as well as dogs

kick used for pushing and for wing-flapping as well as for kicking.

These underextensions and overextensions develop and change over time in an individual child's usage. We'll look at this issue more in a bit.

Since the child's notion of "word" does not necessarily correspond to the adult's, this stage might better be called one-morpheme or one-unit. The point is that the child thinks of the item as a single thing, even if it consists of more than one part for the fluent adult speaker -- such as allgone, a conventional element in English babytalk.

Perception vs. production

Clever experiments have shown that most infants can give evidence (for instance, by gaze direction) of understanding some words at the age of 4-9 months, often even before babbling begins.

In fact, the development of phonological abilities begins even earlier. Newborns can distinguish speech from non-speech, and can also distinguish among speech sounds (e.g. [t] vs. [d] or [t] vs. [k]); within a couple of months of birth, infants can distinguish speech in their native language from speech in other languages. Early linguistic interaction with mothers, fathers, and other caregivers is almost certainly important in establishing and consolidating these early abilities, long before the child is giving any indication of language production abilities.

The "fis phenomenon", from a famous example cited in the literature, exemplifies the difference between perception and production during the development of language.

An investigator was speaking to a child who called his inflated plastic fish a fis.

Adult: This is your fis?

Child: No, my fis. (Rejects repeated imitations.)

Adult: Oh, that is your fish.

Child: Yes, my fis.

Clearly this child can distinguish [s] and [sh] in hearing them, but does not yet distinguish them in pronunciation.

Combining words: the emergence of syntax

During the second year, word combinations begin to appear. Novel combinations (where we can be confident that the result is not being treated as a single word) appear sporadically as early as 14 months.

At 18 months, 11% of parents say that their child is often combining words, and 46% say that (s)he is sometimes combining words.

By 25 months, almost all children are sometimes combining words, but about 20% are still not doing so "often."

Early multi-unit utterances

In some cases, early multiple-unit utterances can be seen as concatenations of individual naming actions that might just as well have occurred alone:

mommy and hat might be combined as mommy hat

shirt and wet might be combined as shirt wet

However, these combinations tend to occur in an order that is appropriate for the language being learned:

All dry.	All messy.	All wet.
I sit.	I shut.	No bed.
No pee.	See baby.	See pretty.
More cereal.	More hot.	Hi Calico.
Other pocket.	Boot off.	Siren by.
Mail come.	Airplane allgone.	Byebye car.
Our car.	Papa away.	Dry pants.

Certain expressions, such as allgone, are often treated by children as single units: they are not yet aware of the internal structure. So, similar to the point made above, this stage might better be termed the two-morpheme stage, since the child is combining two elements that s/he considers to be separate meaningful units (regardless of the adult perspective).

Some combinations with certain closed-class morphemes begin to occur as well:

my turn
in there

However, these are the closed-class words such as pronouns and prepositions that have semantic content in their own right that is not too different from that of open-class words. The more purely grammatical morphemes -- verbal inflections and verbal auxiliaries, nominal determiners, complementizers etc. -- are typically absent.

Since the earliest multi-unit utterances are almost always two morphemes long -- two being the first number after one! -- this period is sometimes called the "two-word stage". Quite soon, however, children begin sometimes producing utterances with more than two elements, and it is not clear that the period in which most utterances have either one or two lexical elements should really be treated as a separate stage.

The examples from English are quite representative of what has been found in other languages. At around a year and a half, children learning any language begin to produce two-unit sentences.

German	buch da	"book there"
	bitte apfel	"please apple"
	wo ball?	"where ball?"
Russian	baba kreslo	"grandma armchair"
	daj chasy	"give watch"
	vady net	"water no"
Finnish	ei susi	"not wolf"
	torni iso	"tower big"
	missä pallo?	"where ball?"
Samoan	fia moe	"want sleep"
	mai pepe	"give doll"
	tapale 'oe	"hit you"

Though the developing grammars are similarly limited, producing primarily two units strung together, they overwhelmingly follow the correct word order for the ambient language; the specifics of the grammar (in addition to the lexicon) are already developing.

In the early multi-word stage, children who are asked to repeat sentences may simply leave out the determiners, modals and verbal auxiliaries, verbal inflections, etc., and often pronouns as well.

Original	Repeated (child)	Child, age
I can see a cow	See cow	Eve, 25 months
The doggy will bite	Doggy bite	Adam, 28 months
Where does Daddy go?	Daddy go?	Daniel, 23 months
Where is the car going?	Car going?	Jem, 21 months

The same pattern can be seen in their own spontaneous utterances (which by the age of 2 years often include more than two units).

Pig say oink	Claire, 25 months
Kathryn no like celery	Kathryn, 22 months
Baby doll ride truck	Allison, 22 months
Want lady get chocolate	Daniel, 23 months

The pattern of leaving out most grammatical/functional morphemes is called telegraphic, and so people also sometimes refer to the early multi-word stage as the "telegraphic stage".
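As a rough illustration of what "telegraphic" means, the Python sketch below strips a hand-picked set of function words from an adult sentence while keeping the remaining words in their original order. The word list is an assumption tailored to the four repetition examples above; it is not a general inventory of English function words, and real children are not this consistent.

```python
# Hand-picked function words covering the repetition examples above (illustrative only).
FUNCTION_WORDS = {"i", "a", "the", "can", "will", "is", "does", "where"}


def telegraphic(sentence: str) -> str:
    """Drop the listed function words, keeping the other words in adult order."""
    words = sentence.rstrip(".?!").split()
    kept = [w for w in words if w.lower() not in FUNCTION_WORDS]
    return " ".join(kept)


for adult in [
    "I can see a cow",
    "The doggy will bite",
    "Where does Daddy go?",
    "Where is the car going?",
]:
    print(adult, "->", telegraphic(adult))
```

Apart from capitalization and punctuation, the results ("see cow", "doggy bite", "Daddy go", "car going") line up with what Eve, Adam, Daniel and Jem actually repeated.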

One way to think about these utterances is that a longer, more adult structure is chopped down to its essential elements (primarily lexical rather than functional words), with the adult order maintained. (Roger Brown, 1973, A First Language: The Early Stages, p. 205; Steven Pinker, 1994, The Language Instinct, p. 273f)

AGENT	VERB	RECIPIENT	OBJECT	LOCATION
Mother	gave	John	lunch	in the kitchen

Mommy fix

Mommy pumpkin

Baby cry

Baby table

Give doggie

Put light

Put floor

Tractor go floor

I ride horsie

Give doggie paper

Put truck window

Adam put it box

Again, it appears that the child is partly modeling the fuller nature of the adult grammar, but is not yet able to "fill in all the slots" in potential sentences.

Grammatical elements and corresponding structures

At about the age of 2, children first begin to use grammatical elements. In English, this includes:

finite auxiliaries (is, was)
verbal tense and agreement affixes (-ed, -s)
nominative pronouns (I, she)
complementizers (that, where)
determiners (the, a).

The process is usually a somewhat gradual one, in which the more telegraphic patterns alternate with adult or adult-like forms, sometimes in adjacent utterances:

She's gone.
Her gone school. Domenico, 24 months

He's kicking a beach ball.
Her climbing up the ladder there. Jem, 24 months

I teasing Mummy.
I'm teasing Mummy. Holly, 24 months

I having this.
I'm having 'nana. (banana) Olivia, 27 months

I'm having this little one.
Me'll have that. Betty, 30 months

Mummy haven't finished yet,
has she? Olivia, 36 months

Over a year to a year and a half, sentences get longer, grammatical elements are less often omitted and less often inserted incorrectly, and multiple-clause sentences begin to be used.

For example, the boy Adam progressed in one year from two- or three-word sentences to complex structures, though still with some errors (Brown 1973, Pinker 1994).

2;3 Play checkers.
Big drum.
I got horn.

2;4 See marching bear go?
Screw part machine.

2;5 Now put boots on.
Where wrench go?
What that paper clip doing?

2;6 Write a piece a paper.
What that egg doing?
No, I don't want to sit seat.

2;7 Where piece a paper go?
Dropped a rubber band.
Rintintin don't fly, Mommy.

2;8 Let me get down with the boots on.
How tiger be so healthy and fly like kite?
Joshua throw like a penguin.

2;9 Where Mommy keep her pocket book?
Show you something funny.

2;10 Look at that train Ursula brought.
You don't have paper.
Do you want little bit, Cromer?

2;11 Do want some pie on your face?
Why you mixing baby chocolate?
I said why not you coming in?
We going turn light on so you can't see.

3;0 I going come in fourteen minutes.
I going wear that to wedding.
Those are not strong mens.
You dress me up like a baby elephant.

3;1 I like to play with something else.
You know how to put it back together.
I gon' make it like a rocket to blast off with.
You want to give me some carrots and some beans?
Press the button and catch it, sir.
Why you put the pacifier in his mouth?

3;2 So it can't be cleaned?
I broke my racing car.
Do you know the light wents off?
When it's got a flat tire it's need a go to the station.
I'm going to mail this so the letter can't come off.
I want to have some espresso.
Can I put my head in the mailbox so the mailman can know where I are and put me in the mailbox?
Can I keep the screwdriver just like a carpenter keep the screwdriver?

One of the things that happens as grammar becomes more complex is that the elements of the sentence develop internal complexity.

Big doggie NP = Adj N

Give doggie paper S = V N N

Give big doggie paper S = V NP N

Such hierarchical structure is, of course, a fundamental property of adult language.
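One way to see what hierarchical structure adds is to write the sentences as nested data. In the Python sketch below, each node is a (label, content) pair, where content is either a word or a list of child nodes. The representation and the helper names are choices made for this illustration, not a standard notation.

```python
# "Give doggie paper": a flat structure, S = V N N
flat = ("S", [("V", "Give"), ("N", "doggie"), ("N", "paper")])

# "Give big doggie paper": S = V NP N, where NP = Adj N
noun_phrase = ("NP", [("Adj", "big"), ("N", "doggie")])
nested = ("S", [("V", "Give"), noun_phrase, ("N", "paper")])


def words(node):
    """Return the words of a tree in left-to-right order."""
    _label, content = node
    if isinstance(content, str):
        return [content]
    result = []
    for child in content:
        result.extend(words(child))
    return result


def depth(node):
    """Return the number of levels in a tree (a single word counts as 1)."""
    _label, content = node
    if isinstance(content, str):
        return 1
    return 1 + max(depth(child) for child in content)


print(words(nested))                # ['Give', 'big', 'doggie', 'paper']
print(depth(flat), depth(nested))   # 2 3
```

The word string only grows by one item, but the tree gains a level: the NP occupies the slot that a single noun filled before.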

Perception vs. production again

Several studies have shown that children who regularly omit grammatical elements in their speech nevertheless expect these elements in what they hear from adults, in the sense that their sentence comprehension suffers if the grammatical elements are missing.

Learning words

Some typical vocabulary items for a two-year-old child are shown here (O'Grady et al. 1989).

objects
body parts:	cheek, ear, foot, hand, leg, nose, toe
food:	cookie, cereal, drink, egg, fish, jam, milk
clothes:	boot, clothes, dress, hat, shirt, shoes, socks
household:	bag, bath, bell, bottle, box, brush, chair, clock, soap, spoon, water, watch
properties	bad, dirty, fat, good, more, nice, poor, sweet
actions and events	bring, burn, carry, catch, clap, come, cut, do, dry, fall, get, give, go, kick, kiss, knit, look, meet, open, pull, push, ring, shut, sit, sleep, speak, sweep, tickle, wag, warm, wash
other	away, down, now, up, no, yes, thank you, goodbye

Individual children, however, differ considerably in the number of words they know at particular stages, and also what vocabulary they happen to have acquired.

Rate of vocabulary development

In the beginning, infants add active vocabulary somewhat gradually. Here are measures of active vocabulary development in two studies.

	Nelson 1973 (18 children)	Fenson 1993 (1,789 children)
Total of 10 words reached at:	15 months (range 13-19)	13 months (range 8-16)
Total of 50 words reached at:	20 months (range 14-24)	17 months (range 10-24)
Vocabulary at 24 months:	186 words (range 28-436)	310 words (range 41-668)

The two studies cited here used different methods:

The Nelson study was based on diaries kept by mothers of all of their children's utterances.

The Fenson study is based on asking mothers to check words on a list to indicate which they think their child produces.

The reliability of the second type of report is particularly subject to question, especially in its tendency to overestimate the numbers.

There is often a spurt of vocabulary acquisition during the second year. Early words are acquired at a rate of 1-3 per week (as measured by production diaries); in many cases the rate may suddenly increase to 8-10 new words per week, after 40 or so words have been learned. However, some children show a more steady rate of acquisition during these early stages. The rate of vocabulary acquisition definitely does accelerate in the third year and beyond: a plausible estimate would be an average of 10 words a day during pre-school and elementary school years.

Perception vs. production again

Benedict (1979) asked mothers to keep a diary indicating not only what words children produced, but what words they gave evidence of understanding. Her results indicate that at the time when children were producing 10 words, they were estimated to understand 60 words; and there was an average gap of five months between the time when a child understood 50 words and the time when (s)he produced 50 words.

All of these methods (maternal diaries and checklists) probably tend to underestimate the number of words about which young children actually know something, although they also may overestimate the number of words to which they attribute adult-like meanings.

For example, a child may "know" the word doggie, but may also think that it applies to any four-legged creature, including cows.

Grammar understanding is also in advance of grammar use (= production).

For example, in one experiment, babies who spoke only in single words were seated in front of two television screens, each of which featured a pair of adults dressed up as Cookie Monster and Big Bird from Sesame Street. One screen showed Cookie Monster tickling Big Bird; the other showed Big Bird tickling Cookie Monster. A voice-over said,

"OH LOOK!!! BIG BIRD IS TICKLING COOKIE MONSTER!! FIND BIG BIRD TICKLING COOKIE MONSTER!!"

(Or vice-versa.) The children must have understood the meaning of the ordering of subject, verb, and object, because they looked more at the screen that depicted the sentence in the voice-over.

(Hirsh-Pasek & Golinkoff, 1991, "Language comprehension: A new look at some old themes"; as summarized by Pinker).

Thus the children had to know already that the tickler is stated before the verb, or that the one tickled is stated after the verb, even though they had never produced a noun and verb in combination.

Morphological acquisition

Children seem to learn categories of morphemes in a consistent order, with possible minor variations in order and great variation in rate (Brown 1973). The last item is acquired, on average, at about 3 years.

present progressive: (is) playing, (was) singing (19-28 mos.)
prepositions: in, on (27-30)
regular noun plural: toys, cats, dishes (24-33)
irregular past tense: came, fell, saw (25-46)
possessive noun: doggie's (26-40)
uncontractible copula: here I am, who is it (27-39)
articles: a, the (28-46)
regular past tense: played, washed, wanted (26-48)
regular third person: sees, wants, washes (26-46)
irregular third person: does, has (28-50)
uncontractible auxiliary: she isn't crying, he was eating (29-48)
contractible copula: that's mine, what's that (29-49)
contractible auxiliary: he's crying (30-50)

Think about some possible influences on the relative ease with which these morphemes are learned:

contribution to overall meaning (past tense vs. third person)
frequency of irregularity (past tense verb vs. plural noun)
variation in pronunciation (cf. allomorphy of plural and past tense)
identifiability of a morpheme (contracted vs. uncontracted verbs)
complexity of meaning (just past tense vs. third person + singular + present tense)

Many morphemes were not studied for the creation of this list, including irregular noun plurals (feet), inflected adjectives (bigger, biggest), and pronouns (I, we, he, she, they). The list ignores some subtleties also: For example, the allomorph of the plural in dishes (with an inserted vowel) is learned later than the allomorphs in cats and dogs.

Progress backwards

As we've seen in adult language, morphological inflections often include a regular case as well as some irregular or exceptional cases:

walk / walked

open / opened

want / wanted

go / went

throw / threw

hold / held

In the earlier stages of child language, all such words will normally be used in their root form, even if (say) the past tense is the apparent intention.

walk

open

want

throw

hold

As inflections first start being added, both regular and irregular patterns are found.

walked

opened

wanted

went

threw

held

At a certain point, it is common for children to overgeneralize the regular case, producing the forms below as well as eated, maked, finded, hitted, falled, doed, speaked, breaked, goed, runned. Irregular noun plurals are replaced by regular formations such as foots, tooths, childs, mans, mouses, peoples.

goed

throwed

holded

At this stage, the child's speech may actually become less correct by adult standards than it was earlier, because of over-regularization.

This over-regularization, like most other aspects of children's developing grammar, is typically resistant to correction:

CHILD: My teacher holded the baby rabbits and we patted them.

ADULT: Did you say your teacher held the baby rabbits?

CHILD: Yes.

ADULT: What did you say she did?

CHILD: She holded the baby rabbits and we patted them.

ADULT: Did you say she held them tightly?

CHILD: No, she holded them loosely.

Eventually, children resume (and refine) their use of irregular forms as they hear them more often in the speech around them.

However, if an irregular form is not heard frequently, some children may never abandon the regular form that they've created based on their knowledge of the regular pattern. Once enough children do this, the irregular form will drop out of the language.

The Shakespearean example of holp for helped, discussed previously, is one of many examples of irregular inflections that have been dropped from the language.

It is largely the regularizing tendency in child language acquisition that leads to this abandonment of older irregular forms.
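
The stages described above can be mimicked with a toy model in which a memorized irregular form, when the child can retrieve it, blocks a general "add -ed" rule. This is only a sketch of one way of thinking about over-regularization, not a claim about what children actually compute. The function names, the retrieval_threshold parameter, and the exposure counts are all invented for illustration.

```python
# Toy model of the stages in the text. Stage 1: bare roots. Stage 2: correct
# forms, treated here as memorized one by one. Stage 3: the regular rule
# applies unless a well-practiced irregular form blocks it.
# All numbers are made-up illustrations, not measured frequencies.

IRREGULAR_PAST = {"go": "went", "throw": "threw", "hold": "held"}


def regular_past(verb):
    """Spelling-only regular rule: add -ed (or -d after e).

    Ignores consonant doubling (run -> runned), so avoid such verbs here.
    """
    return verb + "d" if verb.endswith("e") else verb + "ed"


def past_tense(verb, stage, exposures=None, retrieval_threshold=5):
    """Return the past-tense form a hypothetical child produces.

    exposures maps a verb to how often the child has heard its irregular
    past; the irregular form is retrieved only at or above the threshold.
    """
    exposures = exposures or {}
    if stage == 1:
        return verb  # root form only, even with past meaning
    if stage == 2:
        # Simplification: every form used at this stage is the adult form.
        return IRREGULAR_PAST.get(verb, regular_past(verb))
    # Stage 3 and later: the rule is productive and over-applies.
    if verb in IRREGULAR_PAST and exposures.get(verb, 0) >= retrieval_threshold:
        return IRREGULAR_PAST[verb]
    return regular_past(verb)


print(past_tense("hold", stage=3))                           # rule over-applies
print(past_tense("hold", stage=3, exposures={"hold": 20}))   # irregular blocks rule
```

With no exposure count the model produces holded; with enough exposures it recovers held. A verb whose irregular past is rarely heard never reaches the threshold, which is the situation in which an irregular form can eventually drop out of the language.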

Word meanings

As mentioned above, early words typically involve naming, as in:

duck while the child hits a toy duck off the edge of the bath;

sweep while the child sweeps with a broom;

car while the child looks out of the living room window at cars moving on the street below.

But the meanings that young children assume are often difficult to define precisely, since a single word can be used with much else unspoken and deciphered by the adult according to context.

"Dada!"
(hearing a key in the door)	= "Here comes Daddy!"
(handing him a toy)	= "This is for Daddy."
(looking at an empty chair)	= "That's where Daddy sits."
(touching a shoe)	= "This shoe is Daddy's."

Suppose a child is introduced to this object, and learns that it is called apple.

The next items look quite similar in size, shape, and color. The child naturally wonders, are they also apples?

And how about this, which is similar in shape but the wrong color?

Perhaps the parent also calls it an apple. But if various colors are possible, maybe these are apples too?

It should be obvious that a child might need quite a number of examples to be sure just what a word refers to -- and not just seeing, but also tasting or seeing how the object is used. Until that time, the meanings of words that children assume may be considerably different from the adult meanings. These meanings develop and change over time in an individual child's usage, and eventually come to match the adult meaning.

There are two basic ways in which child meanings differ from adult meanings:

overextension: the meaning is too broad, and perhaps refers to a superordinate category (e.g. using dog for "mammal")
underextension: the meaning is too narrow, and perhaps refers to a hyponymous category (e.g. using dog for "collie")

Consider some ways in which a child might misconstrue the meaning of the word doggie:

main referent for child	the family dog, a black labrador retriever
proper extension	members of this species
possible underextensions	black dogs; large dogs; retrievers
possible overextensions	four-legged animals; pets

Underextension is less common, and reflects a certain conservatism in finding a narrower definition that fits the child's early experience of a word. (Some of these spellings represent the child's pronunciation, others don't.)

baba	his bottle only
car	moving vehicles that can be seen (excluding the one ridden in)

Overextension is more common. It is typically based on the appearance of things, rather than their functions (which can vary quite a bit within the class of objects included by the child). In some cases, the meaning has been extended in more than one direction, or in more than one way.

mooi	moon, cakes, round marks on window, round shapes in books, tooling on leather book covers, postmarks, letter O
koko	rooster crowing, music on a violin, piano, accordion, record player, merry-go-round
wauwau	dog, any animal, toy dog, soft slippers, picture of old man in furs
nana	bananas, zucchini
ticktock	watches, clocks, gas meter, fire hose on a spool, scale with round dial
quack	ducks, all birds and insects, flies, coins (with an eagle on the face)
candy	candy, cherries, anything sweet
cookie	cookies, crackers, candies, bags of other sweets or dried fruits, jars, when pantry doors are opened or closed

This special attention to appearance reflects a fundamental tendency in the way children learn words: an object stands out against the background by virtue of its shape, color, texture, movement, etc., and so this perceptually coherent whole is what the child assumes is most important in the meaning of the word associated with the object.

Children may use a word more liberally when speaking, and yet understand its more limited reference when listening.

For example, if asked to point to the dogs in a picture of many animals, a child might correctly choose the dogs, even if that child uses the word dog to refer to other animals depicted there.

Overextension in production is likely a form of compensation for limited vocabulary, in addition to misunderstanding the reference of a word. This is surely part of the reason that it is so common.

Meaning relations

We can observe increasing complexity in the child's expression of grammatical relations, i.e. subject, verb, or complement of a sentence (Ramer 1977). The following general order for multi-word utterances is quite consistent across children.

1. one grammatical relation, expanded to two elements

[ rocking chair ]
[ want go ]
[ down stairs ]

2. two grammatical relations, each one internally simple

[ mommy ] [ come ]
[ daddy ] [ hospital ]
[ play ] [ sand ]

3. two grammatical relations, one of them expanded

[ her foot ] [ stuck ]
[ the money ] [ inside ]
[ want see ] [ that ]
[ baby ] [ go sleep ]
[ dolly ] [ this carriage ]

4. three grammatical relations

[ mommy ] [ hit ] [ ball ]
[ baby ] [ go ] [ bed ]

5. two grammatical relations, both expanded

[ my mommy ] [ want see ]
[ that hat ] [ on Ernie ]
[ want see ] [ the car ]

Although #4 may seem more complex, since it has three grammatical relations, the fact that it's shorter overall in number of words might explain why it occurs before #5, which has just two grammatical relations but a more complex internal structure.

Word usage

Even when a child knows the basic meaning of a word correctly, it might be used incorrectly according to adult standards. For example, English is quite liberal in permitting nouns to be used as verbs, and vice versa. This is sometimes called conversion since no affix is used.

I poured the water.	I watered the lawn.
They lit the torch.	They torched the building.
We invited them to lunch.	We sent them an invite.

This can't be done with complete freedom, however; children often overgeneralize the process. A common strategy is to use nouns referring to an instrument or location as a verb.

"You have to scale it first."	("scale" = "weigh")
"I broomed her."	("broomed" = "hit with a broom")

Adults sometimes do this as well, though prescriptively it's frowned upon. Some current bugaboos are "to impact" and "to transition."

Failure to recognize similar subtleties of usage is often found in older children, and even adults (including college students), when they use words that they're only partly familiar with. Here are examples from young children who have seized upon a word in a dictionary definition and used a new, unfamiliar word as if it were entirely equivalent (Miller 1991). This strategy fails to take into account the special meanings and connotations of even near-synonyms.

stimulate: rouse, excite, stir up	"Mrs. Morrow stimulated the soup."
correlate: be related one to the other	"Me and my parents correlate, because without them I wouldn't be here."
redress: set right; repair; remedy	"The redress for getting well when you're sick is to stay in bed."
relegate: send away, usually to a lower position or condition; send into exile; hand over (a task)	"I relegated my penpal's letter to her house."
tenet: opinion, belief, principle, or doctrine held to be true	"That news is very tenet."

The need to see words in sufficient contexts for good understanding of their meanings and usages is one of the reasons you have to read a lot to become a good writer. It's not really so different from a young child learning the meaning of "apple" from repeated exposure to examples of appropriate use.

Animal Communication and Language Evolution

Something that human babies find so easy, and human adults find so natural, is actually quite beyond the reach of other animals. We will focus first on natural communication among animals. It seems that while these non-human modes of communication are quite interesting in their own right, they do not translate into the equivalent of human language.

But the important point, which Pinker makes over and over, is that this should not be a big surprise or a cause for claims for or against our evolutionary superiority over other animals. Being concerned over whether they are capable of language is like being concerned over whether we are capable of engaging in a peacock mating display.

Chimps and other animals are clearly capable of certain forms of symbolic communication, which we could call language, but it is simply not the same thing as the language that we linguists are concerned with.

This has nothing to do with claims about some hard line separating humans from other animals or about levels of "evolutionary advancement" or even relative intelligence levels. Quite simply, chimps and gorillas don't and can't learn and use human language because they aren't human. That's all.

What natural animal communication is like

When a dog -- or a wolf or a coyote -- lowers its forelegs to the ground and waves its tail, this is a way of saying, "everything that follows is just a game. Are you ready to play?" Ethologists (students of animal behavior) call this the Canid Play Bow. Cats apparently don't understand this bit of dog metalanguage, though most dog owners do.

The Canid Play Bow is an example of what ethologists call a display, meaning roughly a salient pattern of behavior.

Displays are modulated in intensity or clarity of presentation, with similarly gradient implications for their "meaning." For instance, the eastern kingbird has a vocalization said to sound like "zeer" that indicates possible aggressive intent. The sharpness and harshness of the call varies, and the probability that the animal will actually attack increases as the sharpness/harshness of the call increases.

This is not typical of grammar in human language, only of emotional display in the use of language (such as shouting when angry).

Sometimes displays are quite specific and complex in their behavioral structure and also in their apparent function. Other displays have a specific form, but a function -- or meaning -- that is more difficult to define.

For instance, the black-tailed prairie dog has a display called the Jump-Yip, which involves throwing the forepaws into the air, pointing the nose straight up, and emitting an abrupt two-part vocalization. The Jump-Yip seems to mean "I'm trying to decide whether to run away or do something else that might be more important to me." A bit less than half the time, a Jump-Yip is followed by flight, but there always seems to be an alternative: to make sexual advances to another prairie dog, to challenge a neighbor aggressively, to take a dust bath, or whatever. The display is only used when there is an alternative to flight, but the alternative could be just about anything.

More often, displays are fragments of behavior that may be combined in various ways, and may shade gradually into what may be considered an entirely different display.

For example, here is one ethologist's version of the system of dogs' body language -- changes in body postures and facial expressions by which dogs indicate their feelings and intentions to other dogs -- of which the "play bow" is one part:

A-B:
Neutral to alert attentive positions

C:
Play-soliciting bow

D-E:
Active and passive submissive greeting -- tail wags, ears fold back, weight is transferred to hind legs

F-H:
Gradual shift from aggressive display to ambivalent fear/defensive/aggressive posture

I:
Passive submission

J:
Rolling over and showing belly

While these postures communicate meaning, they are not capable of the open-ended meanings of human language, and cannot be used with negation, or past tense, etc.

There are other cases, for instance warning calls, where displays may have a range of fairly specific referential content.

For instance, vervet monkeys give one kind of alarm call when they see a snake, another kind when they see a leopard, and a third kind when they see an eagle. This set of referential distinctions is useful, because different kinds of defensive behavior are appropriate in the three cases.

In response to the snake warning call, the troop of vervets will all stand up on their hind legs in the open and look around on the ground -- to find the snake. If the predator were an eagle or a leopard, standing up in the open would be very unwise.

In response to the leopard warning call, the members of the troop run up to the top of the nearest tree, where the heavy leopard can't follow them. This would be just the wrong thing to do to avoid an eagle, and not optimal for a snake either.

In response to the eagle warning call, the members of the troop run into a nearby bush or under the lower branches of a nearby tree. Since this is where leopards hide while stalking, and is as likely as anywhere to be on top of a snake, this behavior is again a good response only to the type of predator that the particular call signals.

There is some evidence that vervets show skepticism toward unreliable members of the group, and pay less attention to their warnings; and also that the calls can be used deceptively. But for the most part, the calls seem to be an automatic reflex.

Types of display

Just about anything that can be controlled can become part of a "display."

Electric eels and other electric fish use electrical signaling in courtship and establishment of territory. (They typically have very limited vision, so visual communication is less helpful.)

Cuttlefish and other cephalopods (the class that also includes squid and octopuses) have chromatophores and other specialized cells in their skin that allow them to change its color and texture in complex patterns.

In one species, observers have catalogued 31 distinct full-body patterns with behavioral significance.

For instance, male cuttlefish flash a zebra-striped pattern in circumstances appropriate for mating.

Females generally remain mottled, but if a female turns a uniform gray, it signals that she is ready to mate. The dominant male will approach the female and also turn a uniform gray. If another male approaches, the male involved in the mating will turn on zebra stripes on the side facing the potential rival, while retaining his sexy gray color on the side facing the female.

Bee dancing

The "dances" that honeybees (Apis mellifera) use to convey information about locations of nectar sources were first described by Karl von Frisch (1886-1982). He shared the 1973 Nobel Prize for Physiology or Medicine for his work.

When a scout bee returns from foraging, she crawls onto the vertical combs near the nest entrance and dances for up to several minutes.

The main dance performed by honeybee scouts is the waggle dance, which can be thought of as a "reenactment" of the journey. That interpretation certainly fits with a major theory of animal communication as the ritualized version of other behavior.

The dance consists of running through a small figure eight pattern repeatedly. (Shown here with onlookers.)

The informative portion of the dance is the straight run where the dancer vigorously vibrates (waggles) her abdomen back and forth laterally and emits strong substrate and airborne vibrations in addition to audible (to humans) buzzes.

Each cycle is a straight run followed by a turn to the right to circle back to the starting point, then another straight run followed by a turn and circle to the left; the dancer keeps alternating between right and left turns after each straight run.

The orientation of the straight run conveys the direction of the food source, relative to the position of the sun.

Flowers located directly in line with the sun are represented by waggle straight runs in an upward direction on the vertical combs, and any angle to the right or left of the sun's position is coded by a corresponding angle to the right or left of vertical.

The duration or tempo of the straight runs encodes the distance between nest and target: Dance tempo slows down with increasing distance to the food source.

The farther away the target, the longer the straight runs, with a rate of increase of about 75 milliseconds per 100 meters.

The intensity of the dance encodes the quality of the food source.
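
The direction and distance coding just described can be written out as a small calculation. This is a minimal sketch under stated assumptions, not von Frisch's own procedure: the function name and parameters are hypothetical, angles to the right of vertical are counted as positive, the sun's position is given as a compass azimuth, and run duration is assumed to be directly proportional to distance (the figure above gives only the rate of increase, not any fixed offset).

```python
MS_PER_100_M = 75.0  # rate of increase quoted above


def decode_waggle_run(angle_from_vertical_deg, run_duration_ms, sun_azimuth_deg):
    """Return (compass bearing in degrees, distance in meters).

    angle_from_vertical_deg: positive = right of vertical, negative = left.
    sun_azimuth_deg: compass direction of the sun (0 = north, 90 = east).
    Assumes duration is proportional to distance with no fixed offset,
    which is an assumption of this sketch.
    """
    # Straight up on the comb stands for "toward the sun", so the run's
    # angle is added to the sun's compass direction.
    bearing = (sun_azimuth_deg + angle_from_vertical_deg) % 360
    distance_m = run_duration_ms / MS_PER_100_M * 100.0
    return bearing, distance_m


# A run straight up lasting 750 ms, with the sun due south:
print(decode_waggle_run(0, 750, 180))
# A run 40 degrees left of vertical lasting 300 ms, sun at azimuth 20:
print(decode_waggle_run(-40, 300, 20))
```

Under these assumptions the first dance points due south at 1000 meters, and the second points to a bearing of 340 degrees at 400 meters.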

It is important to note that von Frisch's analysis of the bee dance remains controversial. In particular, some claim that odor plays a primary role, rather than any information encoded in the dance. (Floral odors cling to the returning bee, which has also left pheromones at the nectar site.)

The dances also do not work with great reliability. Many bees simply fail to find the food source or may stumble upon it by following floral aromas or even other bees in flight to and from a rich floral patch. Remember that honeybees are after all just insects with very small brains; they have no mysterious abilities that other social bees or other insects do not possess.

Communicative functions in animals

We've already seen some of the functions of animal displays, vocal or otherwise: warning of predators, attracting mates or scaring off sexual rivals, signaling aggressive or submissive attitudes.

We can try to make a more complete list of functions. Here is one, adapted from a text on the topic:

To advertise individual identity, presence, and behavioral predispositions.
To establish social hierarchies without the disruption of actual fighting or fleeing.
To synchronize the physiological states of a group, e.g. for breeding.
To monitor the environment collectively for dangers and opportunities.
To synchronize organized activities (hunting or foraging, migration, etc.)
To discourage predators (e.g. extravagant leaping and kicking by some antelopes)
To lure prey (e.g. predatory mimicry by fireflies)

Evolution of animal communication

Most animal displays are thought to develop by a process of ritualization of previously existing behavior. A selective advantage may come to exist for those individuals who use certain behaviors in a way that is partly different from the norm, thereby changing the norm in a certain direction.

For instance, urine tends to smell; this can lead to specialized use to mark territory, or develop into chemical sex attractants, resulting in the development of urination behaviors that have nothing to do with the need to void wastes.

Other displays develop from ritualization of intention movements, i.e. incomplete acts, such as turning away or starting to fly.

A related example is the development of frowning from protective eyebrow-lowering, which presumably was originally part of more general preparations for attack or defense.

Movements may be redirected or displaced from their normal context, repeated unnecessarily, etc. as part of ritualization.

We'll discuss evolutionary perspectives on human language in the next lecture.

Instinctive vs. learned patterns

The patterns of animal communication seem to be largely, though not entirely, genetically determined.

All vervet monkeys use the same set of warning cries. However, the particular creatures that evoke them vary from place to place, depending on the local types of predators. For humans, of course, no words are innate -- all are learned from the ambient language, and it's the ability to learn an unlimited number of words that's remarkable in this context.

Baby vervets start out pretty undiscriminating, giving "eagle" warning calls to any reasonably large bird, including those that are completely harmless, and "leopard" warning calls to any largish terrestrial animal, including grazing animals, anteaters etc. (This is certainly reminiscent of overextension of meaning by human children.) The vervet children gradually learn -- apparently from the reactions of their elders -- how not to "cry wolf."

Most animals learn to respond appropriately to the warning cries of other species that happen to live near them. Thus the ability to understand simple communications is not innate to particular species.

Some species -- for instance some songbirds -- have "dialects", in which some properties of their vocalizations or other displays are learned, during a critical period early in each animal's life. The resulting pattern of geographical variation in the vocal displays then plays a role in social structure -- just like human shibboleths.

It is said that such dialect variation tends to occur (in songbirds) just in case the species colonizes a wide range of microhabitats, where each local group becomes especially adapted to its local conditions, and will benefit from breeding within the group to maintain the adaptations intact.

A more elaborate form of social evolution occurs in the songs of humpback whales.

Male humpbacks "sing" long (half-hour) and complex "songs", apparently as part of lek behavior, also called arena behavior.

Lek behavior is a kind of courtship exhibited by many species of insects, birds, and mammals -- sometimes including humans at a singles bar.

A number of males gather in a defined area, each in his own piece of the space, and engage in competitive mating displays.

Females cruise through the arena, observe the displays, and choose a mate based on their evaluation. (There is no lasting pair bond for such species.)

In the case of the humpbacks, the male display appears to be a song. Humpback songs have been recorded (by Navy sonar engineers) since the WWII era. As a result, the following curious pattern has been observed.

At any given time, all the humpbacks in a given region are singing basically the same song. Individual performances may differ in small ways -- an extra trill here, a note left out there, a section repeated somewhere else -- but the basic pattern is the same. Over time, the pattern evolves, until after a year or so, it is completely changed. All the males are still singing the same song as one another, but it is a completely different song than it was before.

After about 50 years of recordings, the old songs have never come back into fashion. The evolution appears to be open-ended.

Humpbacks around Bermuda sing completely different songs from humpbacks around Hawaii, even though they seem to be entirely the same species.

As far as anyone can tell, the patterns in the songs have no meaning at all. They are just abstract patterns. Pure style, no content -- in this sense, at least, like human melodies. Thus they seem to be singing rather than talking.

It remains unclear to what extent the development of the song is a matter of social construction -- everyone following a gradually shifting fashion -- and to what extent it is implicit in a given stage of the song -- that is, the "next thing" is obvious to any humpback with an ounce of style.

In any case, it is tempting to analyze this phenomenon by reference to the human phenomenon of fashion. You can -- and must -- distinguish yourself by a slightly individual interpretation of the current style.

Grammar -- without communication

There are some forms of complex behavior that do not really seem to be properly viewed as displays, since they are done just as often by solitary animals, conspecifics seem to pay no attention to them, etc. Some such behaviors nevertheless seem to have a complex, quasi-grammatical structure.

The best-studied case is mouse grooming.

Mice use their forepaws to spread secretions from oil glands near their eyes. They do this via a rapid series of motions, which may be forehand or backhand, up or down, from the midline to the side or vice versa, etc. The particular order of such gestures has no functional consequence, as long as there are enough of each type to spread the secretions around.

Any given mouse uses a stereotyped sequence of such gestures -- which can be appropriately described with a finite-state grammar -- but different mice use different sequences. Genetically identical mice use identical sequence patterns.

Thus this is a case where a grammar-like organization of behavior is found, though there seems to be no communicative significance. So beware of claims that rely on apparent grammar to argue for actual language, which uses grammar to link form and meaning.
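
To make the idea of a finite-state grammar concrete, here is a minimal sketch of one. The states, the gesture labels, and the particular transitions are invented for illustration and do not come from any study of mouse grooming. Only the general idea is taken from the discussion above: a fixed set of states and transitions can describe which gesture sequences an individual produces, with no meaning attached to any of them.

```python
# Hypothetical finite-state grammar for one mouse's grooming sequence.
# Keys are (current state, gesture); values are the next state.
TRANSITIONS = {
    ("start", "forehand_up"): "stroke",
    ("stroke", "backhand_down"): "return",
    ("return", "forehand_up"): "stroke",      # loop: repeat the stroke pair
    ("return", "midline_to_side"): "done",
}
ACCEPTING = {"done"}


def accepts(gestures, transitions=TRANSITIONS, start="start", accepting=ACCEPTING):
    """True if the gesture sequence follows the grammar to an accepting state."""
    state = start
    for gesture in gestures:
        key = (state, gesture)
        if key not in transitions:
            return False  # this gesture is not allowed in this state
        state = transitions[key]
    return state in accepting


print(accepts(["forehand_up", "backhand_down", "midline_to_side"]))
print(accepts(["backhand_down", "forehand_up"]))
```

The first sequence is accepted and the second is rejected. A different mouse would be modeled by a different transition table, and nothing in the table refers to what any gesture "means".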

The organization of humpback whale songs is different in some ways -- it changes rather than being genetically fixed, and other whales seem to be paying attention to its structure. However, whale song syntax is similar to mouse grooming syntax, in that (as far as we know!) the structure does not seem to have any significance other than itself.

Size of the repertoire

It is generally said that individual animals may have a repertoire of up to about 40 different displays. One authority says:

For most relatively social adult fishes, birds and mammals, the range or repertoire size for different species varies from 15 to 35 displays.

Curiously, there is little correlation with location in the "great chain of being." As far as we know, cuttlefish, related to squids, have about the same repertoire size as non-human primates do.

The hypothesis advanced about this constancy of repertoire size is that there is a sort of dynamic equilibrium involved -- old displays tend to be lost as new ones are added.

Presumably there are some pressures to prevent the accumulation of displays:

they may become ecologically irrelevant;

there may be too much investment required in the brain circuitry involved;

it may be confusing to have too many "lexical items" in the system, lessening the effectiveness of each.

One clear evolutionary puzzle is why -- if communication systems are as useful as we generally think they are -- the rest of the animal kingdom seems to have put so little emphasis on elaborating them beyond this size of a few tens of hard-wired elements.

Primate language instruction

We can't address the issue of non-human communication in a linguistics course without venturing briefly into the vexed question of whether certain apes have been taught "language."

There have been various attempts to see if close relatives of humans such as chimpanzees and gorillas can, under the right circumstances, acquire some form of language.

This is a large and controversial topic. We'll just mention a few points here.

Since apes do not have the right vocal tract configuration to make the sounds of human language (recall the descent of the larynx in humans that made language and choking both easier), some other mode of communication had to be found.

Most apes in these experiments were taught sign language, generally a version of ASL or a simplified sign language based on it.

Others were taught largely arbitrary visual symbols, either as physical objects or as part of a computer keyboard and monitor set-up.

Other points of variation have been the setting for training (in a home or in a lab, with other apes or alone) and the species of ape (gorilla, chimps of various types).

There seems to be little doubt that the apes were able to associate meanings with the various symbols. But to what degree have they been able to combine the items in new ways? This is a hallmark of human language.

The evidence for novel combinations, or (more daringly) simian syntax, is weak. Much of it is anecdotal and relies on picking out a few impressive moments from a large corpus of noise.

To take one example, the chimp Lana was reported to utter this sentence when presented with an orange, for which she had not yet learned a symbol:

apple which-is orange-color

This seems to illustrate appropriate extension of apple to cover other fruit, as children do.

But look at the entire context (in which the trainer Tim is holding an orange):

Tim: What color of this ?

Lana: Color of this orange

Tim: Yes

Lana: Tim give cup which-is red

Tim: Yes

Lana: Tim give which-is shut ?
Shelley give ?

Tim: No Shelley

Lana: Eye
Tim give which-is orange?

Tim: What which-is orange

Lana: Tim give apple which-is green ?

Tim: No apple which-is green

Lana: Tim give apple which-is orange ?

Tim: Yes

The seemingly novel combination occurs in a sequence of rather fixed expressions, in a pattern which may have been learned by rote, plus others that don't make much sense. And importantly, it wasn't Lana who introduced the notion of color here; it was Tim the trainer.

Pinker spends a good deal of time discussing the problems with the now-famous claims that apes like Lana had learned human language. One of the important points of confusion seems to have been a lingering misunderstanding among non-signers of what sign language is. It is not a pantomime of iconic gestures and pointing, but a full-fledged language made up of arbitrary symbols. One of the most damning critiques of the claims of the early primate sign language researchers, which is reported in Pinker, comes from a deaf signer on the team that worked with Washoe -- the only person on that team who was a native signer of the ASL that they were supposed to be teaching the chimp:

Every time the chimp made a sign, we were supposed to write it down in the log ... they were always complaining because my log didn't show enough signs. All the hearing people turned in logs with long lists of signs. They always saw more signs than I did ... I watched really carefully. This chimp's hands were moving constantly. Maybe I missed something, but I don't think so. I just wasn't seeing any signs. The hearing people were logging every movement the chimp made as a sign. Every time the chimp put his finger in his mouth, they'd say "Oh, he's making the sign for drink," and they'd give him some milk ... When the chimp scratched itself, they'd record it as the sign for scratch ... When [the chimps] want something, they reach. Sometimes [the trainers would] say, "Oh, amazing, look at that, it's exactly like the ASL sign for give!" It wasn't.

There can certainly be no doubt that chimps and gorillas are very intelligent, but getting at the cognitive structures in humans is hard enough; it's all the more difficult for non-humans.

In particular, the temptation to anthropomorphize is extremely strong, since humans by their very nature attribute various knowledge and intentions to other people; but as we'll see in a moment, whether this knowledge is actually present is the crux of the matter.

How human language seems to be different

You should be able to fill this section in for yourself, based on the content of this course.

The basic issues are:

the phonological principle, so that words consist of strings of a relatively small number of discrete elements, combined in different ways;

the elaboration of a large (and variable) lexicon, containing words from which new words can be derived, and to which an unrestricted supply of completely new words can be added;

syntactic principles for combining words and other morphemes into sentences, consisting typically of a verb and its participants; and

the existence of a hierarchical compositional semantics, where the meaning of the whole is a function of the meaning of the parts in some way more complex than simple summation of the correlates of independent displays.

All of these elements seem to be missing from natural communication among non-human animals, including primates. They have also not been demonstrated among primates to whom researchers have attempted to teach something like human language.

In addition, human communication -- by means of spoken language or in other ways -- seems to involve a much better-developed theory of mind, involving more complex and abstract models of others' knowledge, beliefs, intentions and goals, and much more complex and systematic use of these models to plan sequences of communicative acts.

Indeed, it remains a matter of debate whether other animals can be said to have a "theory of mind" at all, as opposed to innate automatisms (like gaze following) or learned reactions that make it seem in some cases as if they were communicating in the same way.

At the same time, human facial expression, (aspects of) human "body language", and (aspects of) human voice modulation in speaking seem quite similar in kind to animal displays, and indeed similar in detail in many cases to the expressions and gestures of other primates.

The British philosopher H.P. Grice (1913-1988) argued that notions such as "to communicate" and "to mean" can only be understood in terms of multiple layers of intentions and beliefs on the part of conversational participants.

Thus for the sentence "she means P" to be appropriate, someone has to use that sentence with

the intention of making an audience believe that she believes P, and also with

the intention of making them believe that she used the sentence with the intention of making them believe that she believes P,

and indeed with any number of even more complex conditions on various parties' beliefs and intentions.

Grice went on to argue that the normal operation of this process in human conversation is based on some assumptions about norms of cooperative behavior (as we discussed previously in the lecture on pragmatics) and that these norms become an active part of the (very complex) reasoning process that speakers and hearers use in the joint construction of conversational meaning.

Gricean issues are front and center in discussions of animal communication. Thus Cheney and Seyfarth on "attribution" in monkey and ape communication:

To attribute beliefs, knowledge and emotions to both oneself and others is to have what Premack and Woodruff (1978) term a theory of mind. A theory of mind is a theory because, unlike behavior, mental states are not directly observable. [. . .]

[E]ven without a theory of mind, monkeys are skilled social strategists. It is not essential to attribute thoughts to others to recognize that other animals have social relationships or to predict what other individuals will do and with whom they will do it. Moreover, it is clearly possible to deceive, inform, and convey information to others without attributing mental states to them. [. . .]

However, the moment that an individual becomes capable of recognizing that her companions have beliefs, and that these beliefs may be different from her own, she becomes capable of immensely more flexible and adaptive behavior. [. . .]

Most of the controversy surrounding animal communication. . . centers on second- and third-order intentionality -- whether animals are capable of acting as if they want others to believe that they know or believe something. . . Higher-order intentionality implies the ability to attribute knowledge, beliefs and emotions to others. Attribution, in turn, demands some ability to represent simultaneously two different states of mind. To do this an individual must recognize that he has knowledge, that others have knowledge, and that there can be a discrepancy between his own knowledge and theirs.

It's worth adding that one also needs the ability to use this understanding in the complex reasoning required to plan meaningful acts (speech- or otherwise), and to interpret the acts of others. Such ability to manipulate higher-order intentionality in communicating seems to be one of the innovations of the last couple of million years of hominid evolution.

A caveat: We are almost certainly missing a lot of the subtleties of animal communication, despite years of careful and patient observation by ethologists. But no persuasive evidence of language-like communication has been found, despite considerable effort.

Whether one chooses to emphasize the continuities or the divergences between human and non-human forms of communication is to some extent a matter of taste.

However, those whose interest is the form and meaning of human languages do not, in most cases, find very many points of continuity with the communicative practices of non-human animals.

Some further references:

Are Non-human Species Capable of Language Acquisition? by Ilanit Tof

Evolution of animal communication in Britannica.com

For more discussion of the two sides of the primate language argument, see this article from the New York Times and this overview by Dave Switzer.

The evolution of human language

We've seen, then, that there's nothing quite like human language in the communication of other living animal species (just as there is nothing quite like an elephant's trunk). In the second half of this lecture we take a look at the possible evolutionary origins of language. (They were certainly not as early as this cartoon suggests.)

Where did human language come from, and why? We tend to view the quirks and peculiarities of our species as The Right Thing to Do, and assume that if other species don't do the same, it's because they just haven't evolved to our level. Surely our most complex and perfected language must be what all other species aspire to, mounting a scale of communicative complexity from worms to insects to fish to birds to mammals to us.

However, as mentioned above, the number of communicative displays in a given animal's repertoire ranges from about 15 to 35, no matter what the species. Curiously, there appears to be little correlation between repertoire size and location in the "great chain of being." Cuttlefish, as far as we know, have about as many different communicative displays as chimps do.

Biologists who study the evolution of behavior speculate that there are selective pressures that prevent the overall number of displays from growing beyond a certain point, even though it is clear that new displays are developed to suit new adaptive circumstances. In the same way, although specialized physical organs develop in response to new evolutionary opportunities, other specializations tend to be lost, so that creatures do not over time accumulate indefinitely many humps, horns, ruffs, claws and so on.

Thus human language, with its hundreds of thousands of words, is not just the logical endpoint of some obvious evolutionary scale. Rather, it seems to be a behavioral counterpart of the peacock's tail or the elephant's trunk: a specific, enormously hypertrophied development of structures with rather different original functions.

How and why did this happen? If complex systems of communication are so great, why hasn't evolution been developing them in other species for the last few hundred million years -- as eyes, ears, horns, claws etc. have repeatedly been developed?

Apparent design features of human spoken language

We can list a few characteristics of spoken language:

Large vocabulary: 10,000-100,000 items
Open vocabulary: new items are added easily (especially for content words, but even for function words)
Variation in space and time: different languages and "local accents"
Messages are typically structured sequences of vocabulary items

Compare what is known about the "referential" part of the vocal signaling system of other primates:

Small vocabulary: usually about 10 distinct items
Closed vocabulary: new "names" or similar items are not added
System is fixed across space and time: widely separated populations use the same signals
Messages are usually single items, perhaps with repetition

Some general characteristics of other primate vocalizations that are shared with human speech:

Vocalizations communicate individual identity
Vocalizations communicate attitude and emotional state

Some potential advantages of the human innovations:

Easy naming of new people, groups, places, things, etc.
Ways of referring to past, present, and future
Signs for an arbitrarily large inventory of abstract concepts
Language learning is a large investment in social identity

So human language is not obviously just an extension of some general communicative faculty found in other animals. Where did it come from, then, and how did it develop?

Some earlier ideas

In the 19th century, there was a great deal of speculation among scholars about the origins of language. The resulting theories were on the whole rather primitively formulated, and came to be known by sarcastic names.

The Bow-Wow theory

Language originated in onomatopoeic words that mimicked the sounds made by the things they described, such as animal calls.

The Pooh-Pooh theory

Language originated in words derived from reflexive sounds used to express human emotions such as pain and anger.

The Ding-Dong theory

Language originated in natural connections between sound and meaning, such as imitation of physical sounds (similar to the bow-wow theory).

The Yo-Heave-Ho theory

Language originated in words based on grunts and groans of exertion, as in rhythmic chants that helped people work together.

This list is by no means exhaustive.

Darwin was associated with, for example, the Pooh-Pooh theory:

[Anyone] fully convinced, as I am, that man is descended from some lower animal, is almost forced to believe a priori that articulate language has developed from inarticulate cries.

Oxford philologist Max Müller, following Descartes' view of the essential difference between humans and animals (and sometimes said to be a proponent of the Ding-Dong theory), declared language to be "the Rubicon that no brute will dare to cross."

In 1866, Müller's followers persuaded the Linguistic Society of Paris to ban all presentations on language evolution from its meetings and publications.

There was relatively little interest in the topic for the next 110 years or so. Some possible reasons:

Cartesian uneasiness about trespassing on the domain of the mind.
Lockean uneasiness about a specific biological substrate for language learning.
Unusual difficulty of investigation due to apparent discontinuities between animal and human communication.

Meanwhile, quite a bit of background scientific understanding was developing:

general hominid paleontology
the functioning of the larynx
(paleo-)neurology
ethology and the role of vocal displays: bird song, whale song
studies of primate behavior, social organization, vocal signaling
evolutionary theory

More recently, with the advent of evolutionary psychology, there has been considerable renewed interest in the question of the origins of language.

Is language in our genes?

Does it make sense to ask about the genetic evolution of language?

The essence of life is the transmittal of genetic information. Words like "communication" are sometimes used to talk about the expression of genetic information within the cell, and the transmittal of genetic information to new cells. There are good reasons for these verbal analogies -- there are interesting mathematical affinities between computational linguistics and computational biology.

However, we share this genetic language with every other living thing on earth, while it is only our fellow humans that we can talk with. Although molecular genetics is not the kind of "language" we are investigating, it provides one framework for interpreting our question about the origins of human spoken language. We can ask: what aspects of the human genome make spoken language possible? What selective pressures on our ancestors led these characteristics to develop?

It's conceivable that looking for this genetic basis of human language will not be very enlightening, in particular if language is not directly encoded in the genes, but arises from more general genetic material. For example, we would not learn much by asking for the genetic basis of certain other uniquely human traits, such as the practice of wearing baseball caps backwards. There are things we could say -- humans have heads, for instance, and a tendency to be dazzled by sunlight when looking for things in the air on bright days, whence hats with brims -- but in fact the main issues are cultural, not biological. The human species has not adapted genetically to wearing caps, whether forwards, backwards, or sideways. Instead, the design and use of caps has "evolved" as part of the culture of a particular time and place, among people no different genetically from those with very different tastes in headgear.

Indeed, human language and culture are deeply interconnected, to the point that it would be absurd to study the evolution of language without considering its role in broader social and cultural questions. Genetics and physical evolution alone cannot tell the whole story. However, in contrast to behavior like wearing baseball caps, it is clear that the human species has in fact adapted genetically to facilitate the use of spoken language. Thus it is worthwhile to look into what these adaptations are, and also at some theories about what selective advantage they offered to our ancestors.

Human ancestry

We are talking about evolution during the roughly five million years ("myr" in this chart) since we separated from the ancestors of today's great apes (chimpanzee, gorilla, etc.).

The next chart gives a general overview of this long period of hominid evolution.

We are going to sidestep several controversies:

how many distinct species should be recognized in the fossil record (expert opinions vary from three to fifteen)?
where along the line from Australopithecus to Homo erectus (about 1.8 million years ago) to Homo sapiens (about 100,000 years ago) did how much of the various changes take place?
where on the family tree do various particular species or subspecies (e.g. the Neandertals) fit in?
did the recent change from erectus to sapiens happen in one place (the "out of Africa hypothesis") or over a wide area (the "multi-regional hypothesis")?

The language-related changes took place from the neck up. These changes took place in two areas: the mouth and throat (i.e. the vocal tract), and the brain.

Vocal tract changes in hominid evolution

One set of changes occurred between neck and nose, and served to adapt our vocal tracts for speaking. Specifically, as we saw in the lecture on phonetics and phonology, we shortened our muzzle and the oral cavity it contains, and stretched out our pharynx (throat, in ordinary language) by lowering the larynx (what is behind the Adam's apple).

The comparison below of chimpanzee and human vocal-tract anatomy shows the changes with labels.

The picture below shows the skull of Homo erectus, one of our recent ancestors (or something like it), who lived between about 1.8 million and 100,000 years ago and who appears to be intermediate in these respects between the great apes and our esteemed selves. (With the loss of soft connective tissue, it's a tricky thing to reconstruct the exact location of the larynx, though various types of indirect evidence have been used.)

The result of these changes is to make it possible for our tongue to move forward and back, up and down, in a way that creates resonant cavities of different sizes in various places in the vocal tract, as this synthesis demonstration shows.

Aside from helping with the vowels, however, these changes are a bad idea! The expansion of the pharynx creates some real problems. For instance, it means that laughing while drinking tends to propel liquids out the nose. Much more seriously, it's relatively easy for us to get a chunk of food lodged in the larynx, with potentially fatal results. To quote from Holloway 1996, The evolution of the human vocal apparatus:

The lower position of the larynx alters dramatically the way humans... breathe and swallow. The loss of the ability of the epiglottis to make contact with the soft palate means that the possibility of having two largely separate pathways, one for air and one for liquid, no longer exists. The respiratory and digestive tracts now cross each other in the area of the pharynx... This new configuration can, and does, have unfortunate drawbacks. The major problem is that a bolus of food can become lodged in the entrance of the larynx. If this material cannot be expelled rapidly an individual may literally choke to death... Another disadvantage of the crossed pathways is the relative ease with which vomit can be aspirated into the trachea, and thus pass into the lungs.

This problem is worse for men than for women, because as a secondary sexual characteristic of male humans, the larynx increases in size and moves even lower in the throat at puberty. (That's why a boy's voice cracks when he's going through puberty.) None of the other great apes show this laryngeal sexual dimorphism, or indeed any other vocal tract dimorphism -- though they have much greater dimorphism in overall size, and also show dimorphism of canine teeth, which humans lack entirely.

The unique human development of sexual dimorphism in larynx size and position presumably means that vocalization is important to us in ways that it is not to gorillas and chimps.

Why? Because normally voice pitch correlates with larynx size which correlates with body size. If vocal signaling is significant in courtship, then an individual with a larger larynx generally signals a larger size.

Result: Many species with vocal signaling and male/male competition in courtship show sexual dimorphism in vocal organs. If the signaling system produces sounds whose pitch is size-correlated, then the dimorphism is in a direction that produces lower pitches in males. In humans, the adult male larynx is about 50% larger in linear dimensions than that of the adult female, while other linear dimensions differ only by 8-9% on average. This fact supports theories that relate early language to sexual competition.

Some have argued that gestural language played a role in early forms of human language, though as we've seen, adaptations to language use seem to favor the role of vocal communication from an early period. See this article in American Scientist by Michael Corballis for a speculative discussion of the gestural theory.

Brain changes

One thing that happened to our brain was that it just got bigger. This chart (from Holloway 1996, Evolution of the Human Brain) shows that the relationship of brain weight to body weight is roughly linear on a log-log scale across a large range of primate sizes. The data point for humans is obviously above the trend line by a significant factor, meaning that our brains are surprisingly large for the size of our bodies, even for a primate.
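A straight line on a log-log plot corresponds to a power law, brain = c * body^k, and "above the trend line" means that the observed brain weight is some multiple of what the fitted line predicts for that body weight. The sketch below shows how such a fit and comparison might be computed. The three data points are made up so that they fall exactly on an invented power law; they are not Holloway's data, and the function names are our own.

```python
import math

# Made-up (body weight in kg, brain weight in g) pairs, for illustration only.
# They are constructed to lie exactly on brain = 10 * body ** 0.75 and do NOT
# come from Holloway's chart or from any real species.
SPECIES = [(1.0, 10.0), (16.0, 80.0), (81.0, 270.0)]


def fit_power_law(points):
    """Least-squares line through (log body, log brain).

    Returns (exponent, coefficient) of brain = coefficient * body ** exponent.
    """
    xs = [math.log(body) for body, _ in points]
    ys = [math.log(brain) for _, brain in points]
    n = len(points)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, math.exp(intercept)


def ratio_to_trend(body, brain, exponent, coefficient):
    """How many times larger the observed brain is than the trend predicts."""
    return brain / (coefficient * body ** exponent)


exponent, coefficient = fit_power_law(SPECIES)

# A hypothetical large-brained animal: 16 kg body, 240 g brain.
# The invented trend predicts 80 g at 16 kg, so the ratio is close to 3.
outlier_ratio = ratio_to_trend(16.0, 240.0, exponent, coefficient)
print(exponent, coefficient, outlier_ratio)
```

A ratio of 1 means a species sits on the line; the human data point in the chart corresponds to a ratio well above 1.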

However, the hominid brain did not just get uniformly larger. According to Holloway's discussion:

There are four major reorganizational changes that have occurred during hominid brain evolution, viz.:

(1) reduction of the relative volume of primary visual striate cortex area, with a concomitant relative increase in the volume of posterior parietal cortex, which in humans contains Wernicke's area;

(2) reorganization of the frontal lobe, mainly involving the third inferior frontal convolution, which in humans contains Broca's area;

(3) the development of strong cerebral asymmetries of a torsional pattern consistent with human right-handedness (left-occipital and right-frontal in conjunction); and

(4) refinements in cortical organization to a modern human pattern, most probably involving tertiary convolutions. (This last "reorganization" is inferred; in fact, there is no direct paleoneurological evidence for it.)

Of the four changes cited, the first three straightforwardly involve language in whole or in part.

Wernicke's area in modern humans is involved in comprehension of language.

Broca's area is involved in motor control of speech.

The cerebral asymmetries in the third point involve a localization of language skills in the dominant (generally left) hemisphere of the brain, and of other abilities (visuo-spatial and emotional) in the non-dominant hemisphere.

Like the vocal-tract changes, the brain changes have a cost. For one thing, brain tissue is expensive to maintain, about ten times more expensive than other tissue. The human brain, although only about 2% of our body weight, consumes about 20% of our energy.
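The "ten times" figure can be checked against the two percentages just given. The short calculation below uses only those two numbers; the variable names are our own. Note that the exact multiple depends on whether brain tissue is compared with the body-wide average or with the rest of the body excluding the brain, so "about ten times" should be read as an order-of-magnitude statement.

```python
# Figures from the text: the brain is about 2% of body weight
# and consumes about 20% of the body's energy.
brain_mass_share = 0.02
brain_energy_share = 0.20

# Energy per unit of brain mass, relative to the whole-body average.
relative_to_average = brain_energy_share / brain_mass_share

# Energy per unit of mass for everything except the brain,
# again relative to the whole-body average.
rest_of_body = (1 - brain_energy_share) / (1 - brain_mass_share)

# Energy per unit of brain mass, relative to the rest of the body.
relative_to_rest = relative_to_average / rest_of_body

print(round(relative_to_average, 2))  # about 10
print(round(relative_to_rest, 2))     # about 12.25
```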

For another thing, increased brain size normally translates to increased gestation period, because fetal brain tissue is laid down at a relatively constant rate. This graph shows the relationship for a dozen species from mice to elephants:

Humans are on this graph -- as the isolated point highlighted in red. If the human data point were brought in line with the trend for the rest of the species, it looks like human babies ought to be born about 17 months after conception, rather than 9. However, this would be a bad idea. Anyone who has ever given birth, or witnessed a birth, knows that an 8-month-old baby (17-9=8) just would not make it out, even if the mother could manage the extra period of pregnancy.

Instead, full-term human infants are in fact born "premature" by the standards of the rest of the animal kingdom. Indeed, since development is slowed down after birth as well, human infants are not as mature as new-born chimps until they are a year old or more. Taking care of these "premature" infants imposes considerable burdens on human parents, and especially on the mother, during the first year of life.

Why language?

So a key question is, what was the source of selective pressure for language that made these trade-offs (easier choking, more energy to the brain, vulnerable babies with greater care duties for parents, etc.) worthwhile?

These drawbacks make it easier to understand why species haven't been developing sophisticated systems of symbolic communication left and right, and they force us to sharpen our thinking about what made them sufficiently worthwhile in the specific case of our ancestors.

Evolution is constantly carrying out a sort of experimental cost-benefit optimization. Our hominid ancestors, when they split off from the lineage of chimps and gorillas some 5 million years ago, might have gone on to develop built-in sonar or an improved sense of smell. They might also have stayed about the same, as indeed Homo erectus did for almost two million years. Instead, they learned to talk. Why?

What are these physical changes in jaw, throat, and brain good for that would outweigh their many selective disadvantages?

They're definitely good for spoken language.

The redesigned vocal tract is good for making lots of different vocal sounds. The reorganized and expanded Broca's area deals with control of sound and syntactic structures. The reorganized and expanded Wernicke's area, along with the larger cortex in general, allows us to have lots and lots of words, each one connecting a meaning with a pronunciation.

Somewhere along the line, we learned to think about what others believe -- what philosophers call the "other minds" problem -- and this made us better at communicating regardless of the medium.

But why? Why did our ancestors make such a big investment in talking? As the biological anthropologist Terrence Deacon has recently written, it's easy to think of plausible reasons:

From the perspective of hindsight, almost everything looks as though it might be relevant for explaining the language adaptation. Looking for the adaptive benefits of language is like picking only one dessert in your favorite bakery: there are too many compelling options to choose from. What aspect of human social organization and adaptation wouldn't benefit from the evolution of language? From this vantage point, symbolic communication appears "over-determined." It is as though everything points to it. A plausible story could be woven from almost any of the myriad of advantages that better communication could offer:

organizing hunts,
sharing food,
communicating about distributed food sources,
planning warfare and defense,
passing on toolmaking skills,
sharing important past experiences,
establishing social bonds between individuals,
manipulating potential sexual competitors or mates,
caring for and training young,
and on and on.

One theory, proposed in a recent book by the linguist Derek Bickerton, is that hominids invested in language so as to be able to think better. This hypothesis views rational thought as being at least in large part made up of inner speech.

Each of these ideas has some positive aspects -- the cited advantages certainly do exist to some extent. However, one may doubt how strong many of these effects could have been. For instance, in documented modern hunter-gatherer cultures, language does not play a very large role either in coordinating hunts or in teaching tool-making. In the latter case, for example, people learn mostly by watching. Packs of wolves and wild dogs are extraordinarily clever at group hunting, without being able to talk about it. And many kinds of human thought do not seem to involve language at all.

The evolutionary process that got human language started -- as opposed to reasons to make it bigger, faster, or more powerful once it existed to some degree -- must have been able to accomplish something pretty special, even with the small, poor, stumbling kind of approximation to language that hominids would have been able to manage before any language-specific adaptations took place. Then, because even simple and crummy language was a big success, natural selection would have a chance to create adaptations for complex, excellent language.

Most of the recent theories that meet this test assume that the crucial selective advantages of language were social. Perhaps something about the development of language made the creation and maintenance of larger social groups possible, at a time when larger social groups were essential to survival; or perhaps language permitted a different kind of social organization, enabling our ancestors to move into a different ecological niche.

Two examples of such theories of language evolution are especially striking.

In Grooming, Gossip, and the Evolution of Language, Robin Dunbar proposes that our ancestors evolved language so as to use gossip as a more efficient substitute for the grooming behavior that other primates use to establish and maintain social relationships.

In The Symbolic Species, Terrence Deacon argues that hominid brains and human language have co-evolved over the past two million years, driven by "a reproductive problem that only symbols could solve: the imperative of representing a social contract," which in turn was required to take efficient advantage of the resources available via systematic hunting and scavenging for meat.

Grooming and gossip

Among primates, "encephalization" (brain size normalized for body size) varies in proportion to social group size. Apparently, the larger the group a primate lives in, the more brain it needs to keep track of social relationships within the group. This is plausible, given the intricate micro-politics of primate society, as documented by ethologists.

If we take the step from correlation to causation, and assume that larger brains evolved in primates in order to permit larger social groups (e.g. for better intra-species competition or better defense against predators), we have what has been called the "Machiavellian Intelligence Hypothesis."

If we look at human brain size from the perspective of this hypothesis, and extrapolate the relationship between brain size and social group size found in other primates, we predict a "natural" group size for humans of about 150 (shown by the dotted line).

In primate societies, grooming (picking nits out of fur) is a major factor in establishing and maintaining social bonds. There are interesting hypotheses about why grooming fulfills this function, but for now, we can just note that the bigger the primate group, the more time on average each member spends in grooming others. If we look at human social relations in this perspective, then with a group size of 150, we should have to spend 40% of the day grooming.

This is far too high to be practical -- the highest actual proportion observed among primates is 20% (Gelada baboons).

Dunbar suggests that our ancestors, facing hard times on the African plains, very badly needed to live in larger groups. "Gossiping" (in whatever form it first arose) made it possible to form and maintain social bonds more efficiently than grooming, both because more than two can do it at once, and also because you can actually do some useful work (like gathering or processing food) at the same time. In addition, the development of sense and reference -- and especially of proper names for group members -- enabled political maneuvering at a higher level in larger groups.

The idea, then, is that as the group size increased, so did time necessarily spent grooming -- which for humans took the form of "gossip," the proposed prime cause in the development of language. Dunbar speculates that early language should have begun by the time of early Homo sapiens, about half a million years ago, at which time group size is estimated (on the basis of brain size) to be 115 or 120 individuals. That group would require 30 to 33% grooming time, certainly ripe for replacement by vocal communication. In Dunbar's view, this "vocal chatter" would have developed from the basic calls of primates, becoming more and more contentful over time -- especially regarding social interactions -- and gradually replacing physical grooming.
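The arithmetic behind this argument can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it supposes that grooming time rises in a straight line with group size, and it fits that line to the two figures quoted above (about 31.5% for a group of about 117.5, taking midpoints, and 40% for a group of 150). It is not Dunbar's published regression, and the function names are invented for this example.

```python
# Figures quoted in the lecture: (group size, percent of day spent grooming).
# Using midpoints for the first point is an assumption made for this sketch.
LECTURE_POINTS = [
    (117.5, 31.5),  # midpoint of "115 or 120" and of "30 to 33%"
    (150.0, 40.0),  # predicted "natural" human group size
]

# Highest grooming proportion the lecture reports for any primate (geladas).
OBSERVED_MAX_PERCENT = 20.0


def fit_line(points):
    """Return (slope, intercept) of the straight line through two points."""
    (x1, y1), (x2, y2) = points
    slope = (y2 - y1) / (x2 - x1)
    intercept = y1 - slope * x1
    return slope, intercept


def grooming_percent(group_size, slope, intercept):
    """Grooming time (percent of day) implied by the assumed linear relation."""
    return slope * group_size + intercept


def group_size_at(percent, slope, intercept):
    """Group size at which the assumed relation reaches a given grooming time."""
    return (percent - intercept) / slope


if __name__ == "__main__":
    slope, intercept = fit_line(LECTURE_POINTS)
    for size in (50, 100, 120, 150):
        pct = grooming_percent(size, slope, intercept)
        print(f"group of {size}: about {pct:.1f}% of the day grooming")
    ceiling = group_size_at(OBSERVED_MAX_PERCENT, slope, intercept)
    print(f"the 20% ceiling is reached at a group size of about {ceiling:.0f}")
```

On these assumptions, the 20% ceiling is reached at a group size well below the 115 to 150 range discussed here, which is the gap that Dunbar's "gossip" is supposed to fill.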

Language as symbolic social thinking

Deacon's argument is a complex one, depending on a number of results from ethology and other allied fields. He argues that the key point is a shift to a symbolic mode of communication, in which new linguistic tokens (i.e. words) can be created with an arbitrary relation to their meanings.

As a rule, he argues, significant changes in communicative systems in other species occur "in the context of intense sexual selection."

It is at the point in the life cycle where choice of mate takes place that evolutionary theory predicts we should find the greatest elaboration of communicative behaviors and psychological mechanisms in both pair-bonding species and polygynous species, though the communicators and the messages may differ significantly in these two extremes. Between these extremes there are many more complex mixtures of reproductive social arrangements that add new possibilities and uncertainties, and thus further intensify selection on the production and assessment of signals.

Deacon then points out that human mating arrangements, though diverse across societies, share some characteristics that make our species nearly unique: "cooperative, mixed-sex social groups, with significant male care and provisioning of offspring, and relatively stable patterns of reproductive exclusion, mostly in the form of monogamous relationships."

According to Deacon, "reproductive pairing is not found in exactly this pattern in any other species." The reason this pattern is not found, he argues, is that it's a recipe for socio-sexual disaster: "the combination of provisioning and social cooperation produces a highly volatile social structure that is highly susceptible to disintegration."

In evolutionary terms, a male who tends to invest significant time and energy in caring for and providing food for an infant must have a high probability of being its father, otherwise his expenditure of time and energy will benefit the genes of another male. As a result, indiscriminate protection and provisioning of infants will not persist in a social group when there are other reproducing males around who do not provision, but instead direct all their efforts towards copulation.

These tensions get worse if males and females spend a lot of their time apart, as necessarily happens if males are out hunting and scavenging while females are gathering plants with children in tow. "Hunting and provisioning go together, but they produce an inevitable evolutionary tension that is inherently unstable, especially in the context of group living. Besides ourselves, only social carnivores seem to live this way."

Carnivores that engage in cooperative group hunting include wild dogs, wolves, hyenas, lions, and meerkats. All such creatures exhibit particular ecological and reproductive patterns that defuse the resulting evolutionary tension.

Among lions, provisioning takes place among a "pride" of closely-related females (sisters, aunts, etc.). One, two, or rarely three male lions take over a pride and guard it against other males -- who will try to kill the cubs to bring the females into estrus -- but do not provide food.

Among wild dogs and wolves, the cooperative hunting pack includes both males and females, and they provision both pups and a nursing mother. However, in a given pack there is usually only one reproducing female, who is typically the mother of many of the hunters. Other females are kept from becoming sexually receptive by social pressures and perhaps pheromones. There is usually also only one reproductively active male in a pack.

The typical human pattern, in which many reproductively active males and females live in a group while maintaining patterns of sexual exclusivity, and in which males provision children even though mated males and females spend considerable time apart, is never found among the social carnivores.

Deacon suggests that this background helps to explain why the evolution of systematic hunting as a major food source for our hominid ancestors posed a difficult problem in social engineering.

The acquisition and provisioning of meat clearly would be a better strategy for surviving seasonal shortages of more typical foods than shifting to nutrient-poor diets of pith, bark, and poor-quality leaves, as do modern chimpanzees. But this is only possible if there is a way to overcome the sexual competition associated with paternity uncertainty. The dilemma can be summarized as follows: males must hunt cooperatively to be successful hunters; females cannot hunt because of their ongoing reproductive burdens; and yet hunted meat must get to those females least able to gain access to it directly (those with young), if it is to be critical subsistence food. It must come from males, but it will not be provided in any reliable way unless there is significant assurance that the provisioning is likely to be of reproductive value to the provider. Females must have some guarantee of access to meat for their offspring. For this to evolve, males must maintain constant pair-bonded relationships, and yet for this to evolve, males must have some guarantee that they are provisioning their own progeny. So the socio-ecological problem posed by the transition to a meat-supplemented subsistence strategy is that it cannot be utilized without a social structure which guarantees unambiguous and exclusive mating and is sufficiently egalitarian to sustain cooperation via shared or parallel reproductive interests.

For hunting and provisioning to co-exist in large groups of reproductively active hominids, Deacon argues, it was necessary to establish a certain sort of social contract. If this contract can be established and maintained, then everyone is better off. However, it will not work until nearly everyone observes the terms and also enforces observance among others.

Essentially, each individual has to give up potential access to most possible mates so that others may have access to them, for a similar sacrifice in return.

Accomplishing this requires two things. First, you have to establish a shared understanding of who is bonded with whom. According to Deacon, "this information can only be given expression symbolically", because it "is a prescription for future behaviors," not just a memory or an index of past behavior, or an indication of current social status or reproductive state, or even a prediction of probable future behavior.

The pair-bonding relationship in the human lineage is essentially a . . . set of promises that must be made public. These . . . implicitly determine which future behaviors are allowed and not allowed; that is, which are defined as cheating and may result in retaliation.

Second, you have to get everyone else that might be involved to agree not to cheat, and to help protect against cheating.

For a male to determine he has . . . paternity certainty, requires that other males also provide some assurance of their future sexual conduct. Similarly, for a female to be able to give up soliciting provisioning from multiple males, she needs to be sure that she can rely on at least one individual male who is not obligated to other females to the extent that he cannot provide her with sufficient resources.

A marriage contract is a social contract, not just an agreement between the bonded pair. It is typical in human societies for the social group as a whole to play an active part in maintaining sexual exclusivity between individuals; this is something that happens in no other species. Deacon argues that it happens among humans because all members of the group "are party to the social arrangement, and have something to lose if one individual takes advantage of an uncondoned sexual opportunity."

To sum up: Deacon thinks that early hominids developed symbolic communication as a way to establish social contracts permitting stable family and group structures; without such structures, hunting and scavenging for meat could not have served as a systematic source of supplemental food during times of drought. This set the stage for nearly two million years of evolutionary adaptation for improved symbolic communication, probably due to sexual selection (crudely, females preferred males who could make more convincing promises).

Note that Dunbar and Deacon might both be right: perhaps the development of gossipy chatter as an extension of grooming behavior created a basis for symbolic reference -- maybe originally involving personal names -- that in turn opened the way for the kind of public "contracts" about mating that Deacon sees as crucial to permit systematic hunting.

Needless to say, any such proposals are speculative, although Deacon and Dunbar provide a considerable range of supporting fact and argument. There appears to be fairly general current agreement, at least, that humans are extensively adapted for language, and that establishment and maintenance of social structure was a key source of selective pressure in the evolutionary development of human linguistic adaptations.

"Spandrel" theories

Another perspective on the initial development of language treats it as a sort of accidental side-effect of larger brains, which on this view developed for some other reason (say, to facilitate tool use and/or social dynamics). From this point of view, there's no need to find a specific selectional pressure for language.

This "side-effect" theory would be an example of what Stephen Jay Gould has called evolutionary spandrels. The original meaning of "spandrel" is a space between two arches and a horizontal cornice above them; this space began as an accidental (but unavoidable) consequence of architectural techniques based on the use of arches and domes; because this accidental space is a convenient place to put paintings or other ornamentation, it developed into a planned part of buildings with a specific function.

Gould argues that many evolutionary developments are of this kind -- some feature arises as an accidental side-effect of another change, but then turns out to be useful and comes to be itself shaped by selective pressures.

This spandrel theory is not inconsistent with other accounts of the selective pressures for language development: larger brains, already (partly) evolved for non-communicative purposes, were probably a crucial precondition as the canvas on which the pressures for linguistic communication could have their effect.

More on spandrels: Gould and Lewontin on spandrels; Pinker and Bloom on language as a spandrel; general discussion of evolutionary spandrels.

What were the steps in the process?

On any account of the selective pressures leading to human genetic specialization for spoken language, we may still owe a separate explanation of where the basic behaviors came from. Thus ears are for hearing, and their selective advantages presumably have to do with gaining information about the environment from sound; as a separate matter, it happens to be true that the bones of the mammalian middle ear developed from parts of the reptilian jaw.

For example, we can cite the theory that speech developed out of song. On this view, song-like vocal displays came first, perhaps with a function in sexual selection. Like music, they involved complex patterns but had no specific meaning. Certain "motifs" or bits of vocal pattern came to have referential value, for instance in naming individuals. Even here, however, we must remember that processing of music appears to be largely localized in the right hemisphere, while most language functions are in the left.

Unlike bones, however, behaviors leave little evidence in the fossil record. Since no other species has developed a symbolic communication system like human language, we are not in a good position to make generalizations, except about the many cases where symbolic language did not develop. Therefore, while many interesting ideas have been proposed, it is difficult to make a strong case for or against the various theories of the evolutionary precursors and selective advantages of human language.

The Pinker reading has some detailed discussion on this issue, mainly dedicated to debunking certain criticisms of evolutionary theories of language development that arise from these questions. He points out, for example, that the fact that chimps apparently lack language does not mean that language arose without intermediate stages. Chimps are not our ancestors, they are our cousins, and there have been about 5 million years of evolution since our last common ancestor with the chimps lived. That's plenty of time for natural selection to do some amazing things.

Sources of additional info on language acquisition: on-line lecture notes for a course on language acquisition at Lancaster University can be found here. A good starting point for more information about child language acquisition is the CHILDES web site at CMU, where you can find out about downloading the raw materials of child language research, and also search a specialized child language bibliography.

For your amusement: See this article from the satirical newspaper The Onion, entitled "Study Reveals: Babies are Stupid." Not recommended for the overly sensitive. Of course we've seen plenty of evidence today of just how much babies can do! What's amazing is that, while they fail so miserably at adult tasks like those discussed in the Onion article, they are unequalled geniuses at language learning.
